%propos[s89,jmc]	Proposal to DARPA for supplementary contract
\input memo.tex[let,jmc]
\def\beginlsverbatim{% 
\begingroup
\parindent0pt
\parskip 0pt plus 1pt
\def\par{\leavevmode\endgraf}%
\obeylines%
\obeyspaces%
}
\def\endlsverbatim{\endgroup}
\def\spacer{\hskip 2pt}
{\obeyspaces\global\let =\spacer}
\overfullrule =0pt

\def\pubitm#1#2#3\par{\par\hangafter 1\hangindent 20pt%
    \noindent#1\quad{\bf #2}: \ignorespaces #3}

\centerline{\bf PROPOSAL FOR RESEARCH}
\vskip .1truein
\centerline{\bf IN THE FOUNDATIONS OF ARTIFICIAL INTELLIGENCE}
\bigskip
\bigskip
	This is a proposal for support of research by John
McCarthy
and Vladimir Lifschitz
 in the foundations of artificial intelligence.  The goal
is to lay the scientific foundations for a database of common
sense knowledge that could be used by any AI system.  The cost
will be \$1,642,802 over a three year period starting in September 1989;
the funds will support work by McCarthy,
Lifschitz,
and two graduate students,
%one graduate student
and will pay for the acquisition of two SUN-3 workstations.

	The largest part of McCarthy's work since 1956 has been
in the foundations of artificial intelligence, although he has
also worked in developing the concept of time-sharing, in
symbolic computation (LISP) and in the mathematical theory of
computation (proving that computer programs meet their
specifications).  Within artificial intelligence, most of his
work has been in the logical formalization of common sense
knowledge and reasoning, although he has worked on heuristic
programming in the past and plans to develop some ideas in this
area in the coming period.

	McCarthy developed the situation calculus for expressing
the effects of action (McCarthy and Hayes 1969)
and the circumscription method of nonmonotonic
reasoning (McCarthy 1980, 1986).  Circumscription has been taken up by
many competent
researchers, so McCarthy proposes to concentrate on common
sense knowledge rather than on further development of the
nonmonotonic formalisms.  Specifically, McCarthy plans work on
the following problems in the three year period of the proposal.

	1. Developing formalisms that can be used to express
facts about the common sense world in a way suitable for
inclusion in a database of knowledge of the world that can
be used by whatever AI systems need to know about the world.

	2. Formalizing contexts as objects in a theory.  This
is needed to express context dependent facts in a way that
can be used by AI systems.

	3. Generalizing situation calculus to cover concurrent
events.

	4. Formalizing plans in a way that will permit their
ready modification to deal with unanticipated obstacles.  A
start on this, still unpublished, is included as Appendix B.

	5. Determining how to include universal generalization
in the reasoning done by AI programs.  The problem of the
sterile container, discussed in Appendix A, is an example
of human reasoning using it.  To date AI programs don't do
universal generalization.

	Appendix A,
{\it Artificial Intelligence, Logic and Formalizing Common Sense},
describes the present state of work in the direction of this
proposal.  It will be included in a book edited by Richmond
Thomason about artificial intelligence and philosophical
logic.  One of the goals of the paper is to encourage
philosophical logicians to develop formalisms useful
for AI.

Lifschitz has worked on the mathematical theory of nonmonotonic reasoning,
including its computational aspects, and on applications of nonmonotonic reasoning
to logic programming and to formal reasoning about action and change. He plans
to further investigate the possibilities and limitations of the
situation calculus formalism, including its use for expressing facts about time.

Most applications of the situation calculus discussed so far in the literature
are based on the discrete model of time, but there is no need to introduce
this restriction. For instance, the situation calculus is used in (Lifschitz
and Rabinov 1989) for describing the process of filling a pool with water,
regulated by opening and closing a valve. We can think of the values of the fluent
$time$ as either integers or reals (or, in a more abstract way, as elements of a
totally ordered group).

Adding the concept of time to the situation calculus allows us to talk about
the duration of actions. Here are possible axioms for durations:
$$duration\;e>0,$$
$$time\;result(e,s) = time\;s + duration\;e.$$
Then we can distinguish between actions $slowmove(x,l)$ and $fastmove(x,l)$,
such that
$$duration\;slowmove(x,l)\;>\;duration\;fastmove(x,l).$$
We can even introduce $move(x,l,t)$, the action of moving $x$ to $l$ within $t$
seconds.
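
For illustration, here is a minimal Prolog sketch of these duration axioms;
the predicate names and the particular durations are hypothetical and serve
only to show how the time of a resulting situation would be computed:
\smallskip
{\tt\obeyspaces\obeylines
/* time of the initial situation and of the result of an event */
time(s0,0).
time(result(E,S),T1) :- time(S,T0), duration(E,D), D > 0, T1 is T0 + D.

/* hypothetical durations; a slow move takes longer than a fast one */
duration(slowmove(X,L),10).
duration(fastmove(X,L),2).
duration(move(X,L,T),T).}
\smallskip
With these clauses the goal
$time(result(fastmove(a,l),result(slowmove(a,l),s0)),T)$ succeeds with $T=12$.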

It should also be possible to deal with
delayed effects, as in the sentence: ``30 seconds after you press the button
at the crosswalk, the pedestrian light turns to green'' (an example due to
Yoav Shoham). We will write something like
$$F(s',result(pressbutton,s)) \wedge  time\;s' = time\;s+30 \supset  holds(green,s').$$
Here $F(s1,s2)$ expresses that $s1$ is in the future of $s2$. Perhaps this can
be taken to mean that there exists a sequence of actions, including, generally,
$wait$ actions of different durations, leading from $s2$ to $s1$.

	The assumption that exactly one event occurs at each instant of time
is justifiably perceived as an important limitation of the situation calculus.
Using the function $result$, we can represent two
alternative ``possible futures'' of a situation $s$, corresponding to two
possible actions, by writing $result(a1,s)$ and $result(a2,s)$. But the traditional
syntax provides no symbol for the result of executing the actions $a1$ and $a2$
simultaneously.

Gelfond, Lifschitz and Rabinov (1989) showed how
simple examples of concurrent actions can be expressed without any departure
from the convenient syntax of the situation calculus. The idea is to
treat the simultaneous execution
of $a1$ and $a2$ as another action, $a1+a2$. Then the result of doing $a1$ and
$a2$ concurrently can be written as $result(a1+a2,s)$.
Intuitively, one can think of each action as a set of concurrently performed
``elementary actions.'' Then the sum of actions becomes simply the union of
the corresponding sets. The postulates generalize those of (Lifschitz 1987).
An additional default asserts that the causal
effect of the sum of several actions coincides with the union
of the causal effects of the summands.  Postulates can be added that override
this default.
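
The following Prolog fragment is a minimal sketch of this default.  The
bowl-lifting scenario and all predicate names are hypothetical and serve only
to illustrate the idea; $naf$ is negation as failure, playing the role of the
$not$ used in the programs of Appendix A:
\smallskip
{\tt\obeyspaces\obeylines
naf(P) :- call(P), !, fail.
naf(P).

/* hypothetical elementary effects: lifting either side alone spills */
effect(liftleft,spilled).
effect(liftright,spilled).

/* default: the effect of a sum is the union of the effects of the
   summands, unless a postulate cancels it for that fluent */
effect(A1+A2,F) :- effect(A1,F), naf(cancelled(A1+A2,F)).
effect(A1+A2,F) :- effect(A2,F), naf(cancelled(A1+A2,F)).

/* overriding postulate: lifting both sides at once does not spill */
cancelled(liftleft+liftright,spilled).

holds(F,result(A,S)) :- effect(A,F).}
\smallskip
Thus $holds(spilled,result(liftleft,s0))$ succeeds, while
$holds(spilled,result(liftleft+liftright,s0))$ fails, because the overriding
postulate defeats the default.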

Lifschitz plans to expand this formalism in several ways. The first problem is
to relate concurrency to the duration of actions. One possible axiom is
$$duration(a1+a2) = max(duration\;a1,duration\;a2).$$
Furthermore, the use of time will make it possible to distinguish between
strictly concurrent and partially overlapping actions.
\bigskip
\centerline{\bf References}
\bigskip

\noindent
{\bf Gelfond, Michael, V. Lifschitz, and A. Rabinov (1989)}:
{\it A Theory of Concurrent Actions.} In preparation.

\noindent
{\bf Lifschitz, Vladimir (1987)}: ``Formal theories of action'',
in: {\it The Frame Problem in Artificial Intelligence,
Proceedings of the 1987 Workshop}.

\noindent
{\bf Lifschitz, Vladimir and A. Rabinov (1989)}: ``Things that Change by
Themselves'',
in: {\it Proc.~IJCAI-89} (to appear).

\noindent
{\bf McCarthy, John and P.J. Hayes (1969)}:  ``Some Philosophical Problems from
the Standpoint of Artificial Intelligence'', in D. Michie (ed), {\it Machine
Intelligence 4}, American Elsevier, New York, NY.
%  phil.tex[ess,jmc] with slight modifications

\noindent
{\bf McCarthy, John (1980)}: 
``Circumscription - A Form of Non-Monotonic Reasoning'', {\it Artificial
Intelligence}, Volume 13, Numbers 1,2, April.
%  .<<aim 334, circum.new[s79,jmc], cirnew.tex[s79,jmc]>>

\noindent
{\bf McCarthy, John (1986)}:
``Applications of Circumscription to Formalizing Common Sense Knowledge'',
{\it Artificial Intelligence}, April 1986.
%  circum.tex[f83,jmc]
\vfill\eject
\centerline{\bf Biography of John McCarthy}
\medskip

\noindent
Biographical sketch:

        John McCarthy is Professor of Computer Science at Stanford
University.  He has been interested in artificial intelligence since
1949 and coined the term in 1955.  His main artificial intelligence
research area has been the formalization of common sense knowledge.
He invented the LISP programming language in 1958, developed the concept of
time-sharing in the late fifties and early sixties, and has worked on
proving that computer programs meet their specifications since the early sixties.
His most recent theoretical development, dating from 1978, is the circumscription
method of non-monotonic reasoning.  McCarthy received the A. M. Turing
award of the Association for Computing Machinery in 1971 and was elected
President of the American Association for Artificial Intelligence for
1983-84.  He received the first Research Excellence Award of the
International Joint Conference on Artificial Intelligence in 1985
and the Kyoto Prize of the Inamori Foundation in November 1988.

\bigskip

\beginlsverbatim
BORN:
        September 4, 1927 in Boston, Massachusetts

EDUCATION:
        B.S.  (Mathematics) California Institute of Technology, 1948
        Ph.D. (Mathematics) Princeton University, 1951

HONORS AND SOCIETIES:
        American Academy of Arts and Sciences
        American Association for the Advancement of Science
        American Mathematical Society
        Association for Computing Machinery
        IEEE
        Sigma Xi
        Sloan Fellow in Physical Science, 1957-59
        ACM National Lecturer, 1961
        A. M. Turing Award from Association for Computing Machinery, 1971
        Editorial Board, Artificial Intelligence Journal, 1975 - present
        Academic Advisor, National Legal Center for Public Information, 1976 - 1980
        Board of Directors, Information International, Inc., 1962 - present.
        Board of Directors, Inference Corporation, 1983 - present.
        Board of Directors, Mad Intelligent Systems, 1987 - present.
        Sigma Xi National Lecturer, 1977
        Fellow, Center for Advanced Study in the Behavioral Sciences, 1979 - 1980.
        President, American Association for Artificial Intelligence, 1983-84
        Research Excellence Award, International Joint Conference on
                Artificial Intelligence, 1985
        Elected to National Academy of Engineering, 1987
        Kyoto Prize, 1988
        Elected to National Academy of Sciences, 1989

PROFESSIONAL EXPERIENCE:
\nobreak%
        Procter Fellow, Princeton University, 1950-51
        Higgins Research Instructor in Mathematics,
                Princeton University, 1951-53
        Acting Assistant Professor of Mathematics,
                Stanford University, Sept. 1953 - Jan. 1955
        Assistant Professor of Mathematics, Dartmouth
                College, Feb. 1955 - June 1958
        Assistant Professor of Communication Science,
                M.I.T., 1958 - 1961
        Associate Professor of Communication Science,
                M.I.T., 1961 - 1962
        Professor of Computer Science
                Stanford University, 1962 - present
        Director, Artificial Intelligence Laboratory
                Stanford University, 1965 - 1980
        Charles M. Pigott Professor of Engineering, Stanford University, from 1987
        Bobby R. Inman Professor of Computer Science, University of Texas, Fall 1987

PROFESSIONAL RESPONSIBILITIES AND SCIENTIFIC INTERESTS:
\nobreak%
        With Marvin Minsky organized and directed the Artificial
                Intelligence Project at M.I.T.
        Organized and directed Stanford Artificial Intelligence Laboratory
        Developed the LISP programming system for computing with
                symbolic expressions, participated in the development
                of the ALGOL 58 and the ALGOL 60 languages.  Present
                scientific work is in the fields of Artificial
                Intelligence, Computation with Symbolic Expressions,
                Mathematical Theory of Computation, Time-Sharing computer
                systems, formalization of non-monotonic reasoning,
                parallel symbolic computation.

\endlsverbatim
\bigbreak\centerline{\bf Publications}\nobreak\smallskip
\begingroup
\parindent0pt

{\bf McCarthy, John (1951)}: ``Projection Operators and Partial Differential
Equations'' Ph.D.  Thesis, Princeton University.

{\bf McCarthy, John (1952)}: ``A Method for the Calculation of Limit Cycles by
Successive Approximation'' in {\it Contributions to the Theory of Nonlinear
Oscillations II}, Annals of Mathematics Study No. 29, Princeton University,
pp. 75-79.

{\bf McCarthy, John (1953)}: ``An Everywhere Continuous Nowhere Differentiable
Function,'' {\it American Mathematical Monthly}, December 1953, p. 709.

{\bf McCarthy, John (1954)}:  ``A Nuclear Reactor for Rockets'' {\it Jet Propulsion},
January 1954.

{\bf McCarthy, John (1955)}:  ``The Stability of Invariant Manifolds'' Applied
Mathematics Laboratory Technical Report No. 36, Stanford University, 25
pp.

{\bf McCarthy, John (1956)}:  ``The Inversion of Functions Defined by Turing
Machines,'' in {\it Automata Studies}, Annals of Mathematics Study No. 34,
Princeton, pp. 177-181.

{\bf McCarthy, John (1956)}:  ``Aggregation in the Open Leontief Model,''
in Progress Report of Dartmouth Mathematics Project.

{\bf McCarthy, John (1956)}:  ``Measures of the Value of Information,''
{\it Proceedings of the National Academy of Sciences}, September 1956.

{\bf McCarthy, John (1956)}:  Co-editor with Dr. Claude E. Shannon of {\it Automata
Studies}, Annals of Mathematics Study No. 34, Princeton University Press.

{\bf McCarthy, John (1959)}: ``Programs with Common Sense'', in {\it
Proceedings of the Teddington Conference on the Mechanization of
Thought Processes}, Her Majesty's Stationery Office, London.
%  common[e80,jmc],
% common.tex[e80,jmc]

{\bf McCarthy, John (1960)}: ``Recursive Functions of Symbolic Expressions and their
Computation by Machine,'' {\it Comm. ACM}, April 1960.

{\bf McCarthy, John (with 12 others) (1960)}: ``ALGOL 60'', {\it Numerische
Mathematik}, March 1960, also in {\it Comm. ACM}, May 1960 and Jan. 1963.

{\bf McCarthy, John (1961)}: ``A Basis for Mathematical Theory of Computation'',
in {\it Proc.  Western Joint Computer Conf.}, May 1961, pp. 225-238.
Later version in Braffort, P. and D. Hirschberg (eds.) {\it Computer
Programming and Formal Systems}, North-Holland Publishing Co. (1963).

{\bf McCarthy, John (1962)}: ``Time-Sharing Computer Systems,'' in {\it Management
and the Computer of the Future}, Martin Greenberger (ed.), MIT Press.

{\bf McCarthy, John (with Paul Abrahams, Daniel Edwards, Timothy
Hart and Michael Levin) (1962)}: 
{\it LISP 1.5 Programmer's Manual}, M.I.T. Press, Cambridge, Mass.

{\bf McCarthy, John (1962)}: ``Computer Programs for Checking Mathematical
Proofs'', {\it Amer. Math. Soc. Proc. of Symposia in Pure Math.}, Vol. 5.

{\bf McCarthy, John (1963)}: ``Towards a Mathematical Theory of Computation'',
in Proc.  IFIP Congress 62, North-Holland, Amsterdam.

{\bf McCarthy, John (1963)}: ``A Basis for a Mathematical Theory of Computation'', 
in P. Braffort and D. Hirschberg (eds.), {\it Computer Programming and
Formal Systems}, North-Holland Publishing Co., Amsterdam, pp. 33-70.

{\bf McCarthy, John (1963)}: ``A Time-Sharing Debugging System for a Small
Computer'', (with Boilen, Fredkin and Licklider), Proc. AFIPS 1963 Spring
Joint Computer Conf., Spartan Books, Detroit, pp. 51-57.

{\bf McCarthy, John (1963)}: ``The Linking Segment Subprogram Language and
Linking Loader Programming Languages'' (with F. Corbato and M. Daggett),
Comm. ACM, July 1963.

{\bf McCarthy, John (1965)}: ``Problems in the Theory of Computation'', in Proc.
IFIP Congress 65, Spartan, Washington, D.C.

{\bf McCarthy, John  (1966)}: ``A Formal Description of a Subset of Algol'',
{\it Formal Language Description Languages for Computer Programming},
T.B. Steel, Jr. (ed.), North-Holland Publ. Co., Amsterdam, pp. 1-12.

{\bf McCarthy, John (1968)}:  ``Time-Sharing Computer Systems'', in
{\it Conversational Computers}, William Orr (ed), Wiley Publishing Company.

{\bf McCarthy, John (1966)}:  ``Information'', {\it Scientific
American}, Vol. 215, No. 9.

{\bf McCarthy, John (1967)}:  ``Correctness of a Compiler for Arithmetic
Expressions'' (with James Painter),
{\it Proceedings of Symposia in Applied Mathematics, Volume XIX},
American Mathematical Society.

{\bf McCarthy, John (1967)}:  ``THOR - A Display Based Time-Sharing System'', 
(with D. Brian, G. Feldman, and John Allen)
{\it AFIPS Conf. Proc.}, Vol. 30, (FJCC) Thompson, Washington, D.C.

{\bf McCarthy, John (1967)}:  ``Computer Control of a Hand and Eye'', in {\it Proc.
Third All-Union Conference on Automatic Control (Technical Cybernetics)},
Nauka, Moscow, (Russian).

{\bf McCarthy, John (1968)}:  ``Programs with Common Sense,'' in M. Minsky (ed.), 
{\it Semantic Information Processing}, M.I.T. Press, Cambridge, Mass.

{\bf McCarthy, John (1968)}:  ``A Computer with Hands, Eyes, and Ears,'' 
(with L. Earnest, D. Reddy, P. Vicens) 
{\it Proc. AFIPS Conf.} (FJCC).

{\bf McCarthy, John and P.J. Hayes (1969)}:  ``Some Philosophical Problems from
the Standpoint of Artificial Intelligence'', in D. Michie (ed), {\it Machine
Intelligence 4}, American Elsevier, New York, NY.
%  phil.tex[ess,jmc] with slight modifications

{\bf McCarthy, John (1972)}:  ``The Home Information Terminal,'' in {\it Man and
Computer}, Proceedings of the International Conference, Bordeaux 1970, S. Karger, N.Y.

{\bf McCarthy, John (1973)}:  ``Mechanical Servants for Mankind,'' in {\it Britannica 
Yearbook of Science and the Future}.

{\bf McCarthy, John (1974)}:
Book Review: ``Artificial Intelligence: A General Survey'' by Sir James
Lighthill, in {\it Artificial Intelligence}, Vol. 5, No. 3.
% LIGHT.RE5[ESS,JMC] 25-Jul-75	Lighthill review

{\bf McCarthy, John (1974)}:  ``Modeling Our Minds'' in {\it Science Year 1975, The
World Book Science Annual}, Field Enterprises Educational Corporation,
Chicago, ILL.

{\bf McCarthy, John (1976)}:  ``The Home Information Terminal,'' invited presentation, 
AAAS Annual Meeting, Feb. 18-24, 1976, Boston.

{\bf McCarthy, John (1976)}:  ``An Unreasonable Book,'' a review of {\it Computer
Power and Human Reason}, by Joseph Weizenbaum (W.H. Freeman and Co., San
Francisco, 1976) in {\it SIGART Newsletter 58}, June 1976, also in {\it Creative
Computing}, Chestnut Hill, Massachusetts, 1976 and in ``Three Reviews of Joseph
Weizenbaum's {\it Computer Power and Human Reason}'', (with Bruce
Buchanan and Joshua
Lederberg), Stanford Artificial Intelligence Laboratory Memo 291, Computer
Science Department, Stanford, CA.

{\bf McCarthy, John (1977)}:
Review: {\it Computer Power and Human Reason}, by Joseph Weizenbaum (W.H.
Freeman and Co., San Francisco, 1976) in {\it Physics Today}.

{\bf McCarthy, John (1977)}:
``The Home Information Terminal'' {\it Grolier Encyclopedia}.

{\bf McCarthy, John (1977)}:
``On The Model Theory of Knowledge'' (with M. Sato, S. Igarashi, and
T. Hayashi), {\it Proceedings of the Fifth International Joint Conference
on Artificial Intelligence}, M.I.T., Cambridge, Mass.

{\bf McCarthy, John (1977)}:
``Another SAMEFRINGE'', in {\it SIGART Newsletter} No. 61, February 1977.
%  samefr[f76,jmc]

{\bf McCarthy, John (1977)}:
``History of LISP'', in Proceedings of the ACM Conference on the
History of Programming Languages, Los Angeles.
%  lisp[f77,jmc]

{\bf McCarthy, John (1977)}:
``Epistemological Problems of Artificial Intelligence'', {\it Proceedings
of the Fifth International Joint Conference on Artificial 
Intelligence}, M.I.T., Cambridge, Mass.
%  ijcai.c[e77,jmc]

{\bf McCarthy, John (1979)}:
``Ascribing Mental Qualities to Machines'' in {\it Philosophical Perspectives 
in Artificial Intelligence}, Ringle, Martin (ed.), Harvester Press, July 1979.
%  .<<aim 326, MENTAL[F76,JMC],
% mental.tex[f76,jmc]>>

{\bf McCarthy, John (1979)}: 
``First Order Theories of Individual Concepts and Propositions'', 
in Michie, Donald (ed.) {\it Machine Intelligence 9}, (University of
Edinburgh Press, Edinburgh).
%  .<<aim 325, concep.tex[e76,jmc]>>

{\bf Cartwright, Robert and John McCarthy (1979)}:
``Recursive Programs as Functions in a First Order Theory'',
in {\it Proceedings of the International Conference on Mathematical Studies of
Information Processing}, Kyoto, Japan.
%  .<<aim 324, FIRST.NEW[W77,JMC]>>

{\bf McCarthy, John (1980)}: 
``Circumscription - A Form of Non-Monotonic Reasoning'', {\it Artificial
Intelligence}, Volume 13, Numbers 1,2, April.
%  .<<aim 334, circum.new[s79,jmc], cirnew.tex[s79,jmc]>>

{\bf McCarthy, John and Carolyn Talcott (1980)}: {\it LISP - Programming and
Proving}, course notes, Stanford University. (to be published as a book).

%  .<<The references in this bibliography (BIOJMC[1,JMC]) should be in a
%  .uniform style, because I often copy them to papers.  The last few are
%  .correct.  The publication in italics and first names spelled out.
%  .>>

{\bf McCarthy, John (1982)}: ``Common Business Communication Language'', in
{\it Textverarbeitung und B\"urosysteme}, Albert Endres and J\"urgen Reetz, eds.
R. Oldenbourg Verlag, Munich and Vienna 1982.
%  cbcl[f75,jmc]

{\bf McCarthy, John (1982)}: {\it Coloring Maps and the Kowalski Doctrine},
Report No. STAN-CS-82-903, Computer Science Department, Stanford University,
Stanford, CA 94305.
%  maps[e81,jmc]

{\bf McCarthy, John (1983)}: ``AI Needs more Emphasis on Basic Research'', {\it AI
Magazine}, Volume 4, Number 4, Winter 1983.
%  presid.1[f83,jmc]

{\bf McCarthy, John (1983)}: ``Some Expert Systems Need Common Sense'',
in {\it Computer Culture: The Scientific, Intellectual and Social Impact
of the Computer}, Heinz Pagels, ed.
 vol. 426, Annals of the New York Academy of Sciences.
%paper
%presented at New York Academy of Sciences Symposium.
%  common[e83,jmc]
% common.tex[e83,jmc]

{\bf McCarthy, John (1983)}: ``The Little Thoughts of Thinking Machines'',
{\it Psychology Today}, Volume 17, Number 12, December 1983.
%  psycho.4[e83,jmc]

{\bf Gabriel, Richard and  John McCarthy (1984)}:
``Queue-based Multi-processing Lisp''
in {\it Proceedings of the 1984 ACM Symposium on Lisp and Functional Programming},
August 1984.

{\bf McCarthy, John (1986)}:
``Applications of Circumscription to Formalizing Common Sense Knowledge''
{\it Artificial Intelligence}, April 1986.
%  circum.tex[f83,jmc]

{\bf McCarthy, John (1987)}:
``Review of {\it Artificial Intelligence --- The Very Idea} by John
Haugeland'' to appear in {\it SIAM News}.
% haugel[f86,jmc]

{\bf McCarthy, John (1987)}:
``Generality in Artificial Intelligence'', {\it Communications of the ACM}.
Vol. 30, No. 12, pp. 1030-1035
% genera[w86,jmc]

{\bf McCarthy, John (1987)}:
``Mathematical Logic in Artificial Intelligence'', in
 {\it Daedalus}, vol. 117, No. 1, American Academy of Arts and Sciences,
Winter 1988.
% logic.2[w87,jmc]

{\bf McCarthy, John (1988)}:
``Artificial Intelligence'', for {\it Collier's Encyclopedia}.
% artifi.3[w88,jmc]

{\bf McCarthy, John (1989)}:
``First Signs of Solidarity in the Soviet Union'', {\it Communist Economies},
 v. 1, no. 1.
% glasno[w88,jmc]

{\bf McCarthy, John (1989)}:
``Terms for Soviet Access to Western Computer Technology'', to be published
by the Hoover Institution, Stanford.
% soviet[f88,jmc]

{\bf McCarthy, John (1989)}:
``Artificial Intelligence, Logic and Formalizing Common Sense'',
to appear in Thomason, Richmond, ed. {\it Philosophical Logic
and Artificial Intelligence}, Kluwer Academic Publishing,
Dordrecht, Netherlands.
% thomas[f88,jmc],  final printed version in thomas[s89,jmc]

\endgroup
\vfil\eject
{\parindent=3.0cm\centerline{\bf Biography of Vladimir Lifschitz}
\bigskip

\noindent{\bf EDUCATION}
\bigskip

\item{1971$\qquad$} Ph.D.~in Mathematics from the Steklov Mathematical
Institute, Leningrad, USSR.
\smallskip
\item{1968$\qquad$} Degree in Mathematics (B.S., M.S.~equivalent) from
Leningrad University, Leningrad, USSR.
\bigskip
\bigskip

\noindent{\bf RESEARCH AND TEACHING APPOINTMENTS}
\bigskip
\item{1985-now$\qquad$} Senior Research Associate, Computer Science Department,
			Stanford University.
\medskip
\item{1982-84$\qquad$} Associate Professor of Computer Science, University
of Texas at El Paso.
\medskip
\item{1979-82$\qquad$} Assistant Professor, University of Texas at El Paso.
\medskip
\item{1977-79$\qquad$} Visiting Assistant Professor, Brigham Young University.
\medskip
\item{1976-77$\qquad$} Research Associate, Stanford University.
\medskip
\item{1971-74$\qquad$} Research Associate, Institute of Mathematics and Economics,
Leningrad, USSR.
\bigskip
\bigskip

\parindent=20pt

\noindent{\bf MEMBERSHIP IN PROFESSIONAL ORGANIZATIONS}
\bigskip

\noindent Association for Computing Machinery.
\medskip
\noindent American Association for Artificial Intelligence.
\medskip
\noindent Association for Symbolic Logic.
\vfil\eject

\noindent{\bf INVITED PAPERS AT PROFESSIONAL MEETINGS}
\bigskip

\item{1.} ``Calculable Natural Numbers'', Conference on Constructive Mathematics,
New Mexico State University, 1980.
\smallskip
\item{2.} ``Classical Numbers from the Constructive Standpoint'',
L.~E.~J.~Brouwer Centennial, Netherlands, 1981.
\smallskip
\item{3.} ``Models of Circumscription'',
Workshop on Logic and Computer Science,
University of Kentucky, 1985.
\smallskip
\item{4.} ``On the Semantics of STRIPS'',
Workshop on Planning and Reasoning about Action, Timberline, 1986.
\smallskip
\item{5.} ``On the Declarative Semantics of Logic Programs with Negation'',
Workshop on the Foundations of Logic Programming and Deductive Databases,
Washington, D.C., 1986.
\smallskip
\item{6.} ``Reasoning about Action'', The Third Conference of the Israeli
Association for Artificial Intelligence, Tel-Aviv, Israel, 1987.
\smallskip
\item{7.} ``Negation as Failure and Introspective Reasoning'',
The Third International Symposium on Methodologies for Intelligent
Systems, Torino, Italy, 1988.
\smallskip
\item{8.} ``Mathematical Models of Nonmonotonic Reasoning'', International
Conference on Computer Logic, Tallinn, USSR, 1988.
\smallskip
\item{9.} ``The Mathematics of Nonmonotonic Reasoning'', The Fourth Annual
Symposium on Logic in Computer Science, to be held in Asilomar, California,
in June of 1989.
\smallskip
\item{10.} ``Logical Foundations of Deductive Databases'', The Eleventh World
Computer Congress, to be held in San Francisco, California, in August of 1989.
\bigskip
\bigskip

\noindent{\bf OTHER PROFESSIONAL ACTIVITIES}
\bigskip
\noindent Member of the Editorial Board of the journal {\sl Artificial
Intelligence.}
\medskip
\noindent Member of the Editorial Board of the {\sl Journal of Automated
Reasoning.}
\medskip
\noindent Member of the
Committee on Translations of the Association for Symbolic Logic.
\bigskip
\bigskip
\noindent{\bf PROFESSIONAL AWARDS}
\bigskip\noindent
Publishers' Prize, The Tenth International Joint Conference on Artificial
Intelligence (Milan, 1987).
\bigskip

\noindent{\bf PUBLICATIONS}
\bigskip
\item{1.} ``The normal form for deductions in predicate calculus with equality
and function symbols'',
{\sl Seminars in Mathematics} {\bf 4}, Leningrad: Steklov
Math.~Institute, 1967. {\sl Math.~Rev.}~{\bf 38} $\# 5583$.
\smallskip
\item{2.} ``Certain reduction classes and undecidable theories'',
{\sl Seminars in Mathematics} {\bf 4}, Leningrad: Steklov
Math.~Institute, 1967. {\sl Math.~Rev.}~{\bf 38} $\# 5603$.
\smallskip
\item{3.} ``Deductive validity and reduction classes'',
{\sl Seminars in Mathematics} {\bf 4}, Leningrad: Steklov
Math.~Institute, 1967. {\sl Math.~Rev.}~{\bf 38} $\# 5604$.
\smallskip
\item{4.} ``The decision problem for some constructive theories of equality'',
{\sl Seminars in Mathematics} {\bf 4}, Leningrad: Steklov
Math.~Institute, 1967. {\sl Math.~Rev.}~{\bf 38} $\# 65$.
\smallskip
\item{5.} ``The decision problem for some constructive theories of equality'',
{\sl Seminars in Mathematics} {\bf 4}, Leningrad: Steklov
Math.~Institute, 1967. {\sl Math.~Rev.}~{\bf 38} $\# 4281$.
\smallskip
\item{6.} ``Constructive mathematical theories consistent with classical logic'',
{\sl Proc.~Steklov Math.~Institute} {\bf 98}, 1968. {\sl Math.~Rev.}~{\bf 36}
$\# 6265$.
\smallskip
\item{7.} ``A refinement of the form of deductions in predicate calculus with
equality and function symbols'',
{\sl Proc.~Steklov Math.~Institute} {\bf 98}, 1968. {\sl Math.~Rev.}~{\bf 40}
$\# 5415$.
\smallskip
\item{8.} ``Constructive analytic functions of the real variable'',
{\sl Seminars in Mathematics} {\bf 8}, Leningrad: Steklov
Math.~Institute, 1968. {\sl Math.~Rev.}~{\bf 43} $\# 7323$.
\smallskip
\item{9.} ``On the set of zeroes of a constructive power series in a real
variable'',
{\sl Seminars in Mathematics} {\bf 16}, Leningrad: Steklov
Math.~Institute, 1969. {\sl Math.~Rev.}~{\bf 41} $\# 5193$.
\smallskip
\item{10.} ``The investigation of constructive functions by the method of
fillings'', {\sl Journal of Soviet Math.} {\bf 1}, 1972.
{\sl Math.~Rev.}~{\bf 48} $\# 3714$.
\smallskip
\item{11.} ``A locally analytic constructive function which is not analytic'',
{\sl Soviet Math.~Dokl.}~{\bf 13}, 1972.
{\sl Math.~Rev.}~{\bf 45} $\# 5320$.
\smallskip
\item{12.} ``A non-compact closed sphere in a compact constructive metric space'',
{\sl Seminars in Mathematics} {\bf 32}, Leningrad: Steklov
Math.~Institute, 1972. (With V.~P.~Chernov). 
{\sl Math.~Rev.}~{\bf 49} $\# 1487$.
\smallskip
\item{13.} ``$CT↓0$ is stronger than $CT↓0!$'', {\sl Proc.~Amer.~Math.~Soc.}
{\bf 73}(1), 1979.
\smallskip
\item{14.} ``An intuitionistic definition of classical natural numbers'',
{\sl Proc.~Amer.~Math.~Soc.}
{\bf 77}(3), 1979.
\smallskip
\item{15.} ``Semantical completeness theorems in logic and algebra'',
{\sl Proc.~Amer.~Math.~Soc.}
{\bf 79}(1), 1980.
\smallskip
\item{16.} ``The efficiency of an algorithm of integer programming:
a probabilistic analysis'',
{\sl Proc.~Amer.~Math.~Soc.}
{\bf 79}(1), 1980.
\smallskip
\item{17.} ``The number of increasing subsequences of the random permutation'',
{\sl Journal of Combinatorial Theory}, Ser.~A,
{\bf 31}, 1981. (With B. Pittel).
\smallskip
\item{18.} ``Constructive assertions in an extension of classical mathematics'',
{\sl Journal of Symbolic Logic}
{\bf 47}(2), 1982.
\smallskip
\item{19.} ``A note on the complexity of a partition algorithm'',
{\sl Information Processing Letters}
{\bf 17}, 1983. (With L.~Pesotchinsky).
\smallskip
\item{20.} ``The worst-case performance and the most probable performance of a
class of set-covering algorithms'',
{\sl SIAM Journal on Computing}
{\bf 12}(2), 1983. (With B.~Pittel).
\smallskip
\item{21.} ``On verification of programs with goto statements'',
{\sl Information Processing Letters}
{\bf 18}, 1984.
\smallskip
\item{22.} ``Some results on circumscription'', in:
{\sl Non-Monotonic Reasoning Workshop}, New Paltz, 1984.
\smallskip
\item{23.} ``Calculable natural numbers'', in:
{\sl Intensional Mathematics}, North-Holland, 1985.
\smallskip
\item{24.} ``Computing circumscription'', in:
{\sl Proc.~IJCAI-85}, 1985.
\smallskip
\item{25.} ``Closed-world data bases and circumscription'',
{\sl Artificial Intelligence}, {\bf 27}, 1985.
\smallskip
\item{26.} ``On the satisfiability of circumscription'',
{\sl Artificial Intelligence}, {\bf 28}, 1986.
\smallskip
\item{27.} {\sl Mechanical Theorem Proving in the USSR: Leningrad School},
Delphic Associates, 1986.
\smallskip
\item{28.} ``On the semantics of STRIPS'', in:
{\sl Reasoning about Actions and Plans},
Morgan Kaufmann, Los Altos, California, 1987.
\smallskip
\item{29.} ``Pointwise circumscription: preliminary report'',
in: {\sl Proc.~AAAI-86}, 1986.
\smallskip
\item{30.} ``On the declarative semantics of logic programs with negation'',
in: {\sl Workshop on Foundations of Logic Programming and Deductive Databases},
Washington, D.C., 1986.
\smallskip
\item{31.} ``Formal theories of action: preliminary report'',
in: {\sl Proc.~IJCAI-87}, 1987.
\smallskip
\item{32.} ``Formal theories of action'', in: {\sl The Frame Problem in Artificial
Intelligence, Proceedings of the 1987 Workshop}, 1987.
\smallskip
\item{33.} ``Circumscriptive theories: a logic-based framework for commonsense
knowledge (preliminary report)'', in: {\sl Proc.~AAAI-87}, 1987.
\smallskip
\item{34.} ``Pointwise circumscription'', in: {\sl Readings in Non-Monotonic
Reasoning}, Morgan Kaufmann, Los Altos, California, 1987.
\smallskip
\item{35.} ``Commentary on McDermott'', {\sl Computational Intelligence} {\bf 3},
1987 (with J. McCarthy).
\smallskip
\item{36.} ``Non-monotonic reasoning and causality'', in: {\sl Abstracts of
LMPS-87}, 1987 (with J. McCarthy).
\smallskip
\item{37.} ``The stable model semantics for logic programming'', in: {\sl Logic
Programming: Proc.~5th Int'l.~Conf.~and Symp.}, 1988 (with M. Gelfond).
\smallskip
\item{38.} ``Compiling circumscriptive theories into logic programs: preliminary
report'', in: {\sl Proc.~AAAI-88}, 1988 (with M. Gelfond).
\smallskip
\item{39.} ``Circumscriptive theories: a logic-based framework for commonsense
knowledge'', {\sl Journal of Philosophical Logic}, {\bf 17}, 1988.
\smallskip
\item{40.} ``Compiling circumscriptive theories into logic programs'', in:
{\sl Non-Monotonic Reasoning} (Lecture Notes in Artificial Intelligence, {\bf 346}),
Springer-Verlag, 1989 (with M. Gelfond).
\smallskip
\item{41.} ``Benchmark problems for formal nonmonotonic reasoning'', in:
{\sl Non-Monotonic Reasoning} (Lecture Notes in Artificial Intelligence, {\bf 346}),
Springer-Verlag, 1989.
\smallskip
\item{42.} ``Miracles in formal theories of action'', {\sl Artificial
Intelligence}, {\bf 38}, 1989 (with A. Rabinov).
\smallskip
\item{43.} ``What is the inverse method?'', {\sl Journal of Automated Reasoning},
{\bf 5}, 1989.
\smallskip
\item{44.} ``Between circumscription and autoepistemic logic'',
in: {\sl Proceedings of the First International Conference on Principles of
Knowledge Representation and Reasoning}, Morgan Kaufmann, Los Altos, California,
1989.
\smallskip
\item{45.} ``Critical issues in nonmonotonic reasoning'',
in: {\sl Proceedings of the First International Conference on Principles of
Knowledge Representation and Reasoning}, Morgan Kaufmann, Los Altos, California,
1989 (with D. Etherington, K. Forbus, M. Ginsberg and D. Israel).
\smallskip
\item{46.} ``Things that change by themselves'', in: {\sl Proc.~IJCAI-89}, 1989
(to appear, with A. Rabinov).
\smallskip
\item{47.} ``The mathematics of nonmonotonic reasoning'', in: {\sl Proc.~LICS-89},
1989 (to appear).
\smallskip
\item{48.} ``Logical foundations of deductive databases'', in: {\sl Proc.~IFIP
Congress'89}, 1989 (to appear).
\smallskip
}
\vfil\eject
\centerline{\bf Appendix A. Artificial Intelligence, Logic and Formalizing
Common Sense}
\bigskip
\section{Introduction}

	This is a position paper about the relations among
artificial intelligence (AI), mathematical logic and the
formalization of common sense knowledge and reasoning.  It also
treats other problems of concern to both AI and philosophy.  I thank
the editor for inviting it.  The position advocated is that
philosophy can contribute to AI if it treats some of its
traditional subject matter in more detail and that this will
advance the philosophical goals also.  Actual formalisms (mostly
first order languages) for expressing common sense facts are
described in the references.

	Common sense knowledge includes the basic facts about events
(including actions) and their effects, facts about knowledge and how
it is obtained, facts about beliefs and desires.  It also includes
the basic facts about material objects and their properties.

	One path to human-level AI uses mathematical logic to
formalize common sense
knowledge in such a way that common sense problems can be
solved by logical reasoning.  This methodology requires
understanding the common sense world well enough to formalize
facts about it and ways of achieving goals in it.  Basing AI on
understanding the common sense world is different from basing it
on understanding human psychology or neurophysiology.
This approach
to AI, based on logic and computer science, is complementary to
approaches that start from the fact that humans exhibit intelligence,
and that explore human psychology or human neurophysiology.

	This article discusses the problems and difficulties, the
results so far, and some improvements in logic and logical languages
that may be required to formalize common sense.  Fundamental
conceptual advances are almost certainly required.  The object of the
paper is to get more help for AI from philosophical logicians.  Some
of the requested help will be mostly philosophical and some will be
logical.  Likewise the concrete AI approach may fertilize
philosophical logic as physics has repeatedly fertilized mathematics.

	There are three reasons for AI to emphasize common sense
knowledge rather than the knowledge contained in scientific
theories.

	(1) Scientific theories represent compartmentalized
knowledge.  In presenting a scientific theory, as well as in
developing it, there is a commonsense pre-scientific stage.  In
this stage, it is decided or just taken for granted what
phenomena are to be covered and what is the relation between
certain formal terms of the theory and the commonsense world.
Thus in classical mechanics it is decided what kinds of bodies
and forces are to be used before the differential equations are
written down.  In probabilistic theories, the sample space is
determined.  In theories expressed in first order logic, the
predicate and function symbols are decided upon.  The axiomatic
reasoning techniques used in mathematical and logical theories
depend on this having been done.  However, a robot or computer
program with human-level intelligence will have to do this for
itself.  To use science, common sense is required.

	Once developed, a scientific theory remains imbedded
in common sense.  To apply the theory to a specific problem,
commonsense descriptions must be matched to the terms of the theory.
For example, $d = {1\over 2} gt↑2$ does not in itself identify
$d$ as the distance a body falls in time $t$ and identify $g$
as the acceleration due to gravity.  
(McCarthy and Hayes 1969) uses the {\it situation
calculus} introduced in that paper to imbed the above formula
in a formula describing the common sense situation, for example
%
$$\eqalign{
&dropped(x,s) \wedge  height(x,s) = h \wedge  d = {1\over 2}gt↑2 \wedge  d < h\cr
\supset  &\exists s'(F(s,s') \wedge time(s') = time(s)+t
 \wedge  height(x,s') = h - d).}$$
%
Here $x$ is the falling body, and we are presuming a language
in which the functions $height$, $time$, etc. are formalized
in a way that corresponds to what the English words suggest.
$s$ and $s'$ denote {\it situations} as discussed in that paper,
and $F(s,s')$ asserts that the situation $s'$ is in the future
of the situation $s$.

	(2) Commonsense reasoning is required
for solving problems in the common sense world.  From the problem
solving or goal-achieving point of view, the commonsense world is
characterized by a different {\it informatic situation} than that
{\it within} any formal scientific theory.  In the typical common
sense informatic
situation, the reasoner doesn't know what facts are relevant to
solving his problem.  Unanticipated obstacles may arise that involve
using parts of his knowledge not previously thought to be relevant.

	(3) Finally, the informal metatheory of any scientific
theory has a commonsense informatic character.  By this I mean
the thinking about the structure of the theory in general and the
research problems it presents.  Mathematicians invented the
concept of a group in order to make previously vague parallels
between different domains into a precise notion.  The thinking
about how to do this had a commonsense character.

	It might be supposed that the common sense world would admit a
conventional scientific theory, e.g. a probabilistic theory.  But no
one has yet developed such a theory, and AI has taken a somewhat
different course that
involves nonmonotonic extensions to the kind of reasoning used in
formal scientific theories.  This seems likely to work better.

	Aristotle, Leibniz, Boole and Frege all included common sense
knowledge when they discussed formal logic.  However,
formalizing much of common sense knowledge and reasoning proved
elusive, and the twentieth century emphasis has been on formalizing
mathematics.  Some important philosophers, e.g. Wittgenstein, have
claimed that common sense knowledge is unformalizable or mathematical
logic is inappropriate for doing it.  Though it is possible to give a
kind of plausibility to views of this sort, it is much less easy to
make a case for them that is well supported and carefully worked out.
If a common sense reasoning problem is well presented, one is well on
the way to formalizing it.  The examples that are presented for this
negative view borrow much of their plausibility from the inadequacy of
the specific collections of predicates and functions they take into
consideration.  Some of their force comes from not formalizing
nonmonotonic reasoning, and some may be due to lack of logical tools
still to be discovered.  While I acknowledge this opinion, I haven't
the time or the scholarship to deal with the full range of such
arguments.  Instead I will present the positive case, the problems that
have arisen, what has been done and the problems that can be foreseen.
These problems are often more interesting than the ones suggested by
philosophers trying to show the futility of formalizing common sense, and
they suggest productive research programs for both AI and philosophy.

	In so far as the arguments against the formalizability of
common sense attempt to make precise intuitions of their authors,
they can be helpful in identifying problems that have to be solved.
For example, Hubert Dreyfus (1972) said that computers couldn't have
``ambiguity tolerance'' but didn't offer much explanation of the
concept.  With the development of nonmonotonic reasoning, it became
possible to define some forms of {\it ambiguity tolerance} and show
how they can and must be incorporated in computer systems.  For
example, it is possible to make a system that doesn't know about
possible {\it de re}/{\it de dicto} ambiguities and has a
default assumption that amounts to saying that a reference holds
both {\it de re} and {\it de dicto}.  When this assumption
leads to inconsistency, the ambiguity can be discovered and
treated, usually by splitting a concept into two or more.

	If a computer is to store facts about the world and reason
with them, it needs a precise language, and the program has to embody
a precise idea of what reasoning is allowed, i.e. of how new formulas
may be derived from old.  Therefore, it was natural to try to use
mathematical logical languages to express what an intelligent computer
program knows that is relevant to the problems we want it to solve and
to make the program use logical inference in order to decide what to
do.  (McCarthy 1959) contains the first proposals to use logic in AI
for expressing what a program knows and how it should reason.
(Proving logical formulas as a domain for AI had already been
studied by several authors).

	The 1959 paper said:

\begingroup\narrower\narrower
% COMMON.TEX[E80,JMC] TeX version Programs with Common Sense
%
The {\it advice taker} is a proposed program for solving problems by
manipulating sentences in formal languages.  The main difference
between it and other programs or proposed programs for manipulating
formal languages (the {\it Logic Theory Machine} of Newell, Simon and
Shaw and the Geometry Program of Gelernter) is that in the previous
programs the formal system was the subject matter but the heuristics
were all embodied in the program.  In this program the procedures will
be described as much as possible in the language itself and, in
particular, the heuristics are all so described.

	The main advantages we expect the {\it advice taker} to have
is that its behavior will be improvable merely by making statements to
it, telling it about its symbolic environment and what is wanted from
it.  To make these statements will require little if any knowledge of
the program or the previous knowledge of the {\it advice taker}.  One
will be able to assume that the {\it advice taker} will have available
to it a fairly wide class of immediate logical consequences of anything
it is told and its previous knowledge.  This property is expected to
have much in common with what makes us describe certain humans as
having {\it common sense}.  We shall therefore say that {\it a program
has common sense if it automatically deduces for itself a sufficiently
wide class of immediate consequences of anything it is told and what
it already knows.}
\par\endgroup

	The main reasons for using logical sentences extensively in AI
are better understood by researchers today than in 1959.  Expressing
information in declarative sentences is far more modular than
expressing it in segments of computer program or in tables.  Sentences
can be true in much wider contexts than specific programs can be
useful.  The supplier of a fact does not have to understand much about
how the receiver functions, or how or whether the receiver will use it.
The same fact can be used for many purposes, because the logical
consequences of collections of facts can be available.

	The {\it advice taker} prospectus was ambitious in 1959, would
be considered ambitious today and is still far from being immediately
realizable.  This is especially true of the goal of expressing the
heuristics guiding the search for a way to achieve the goal in the
language itself.  The rest of this paper is largely concerned with
describing what progress has been made, what the obstacles are, and
how the prospectus has been modified in the light of what has been
discovered.

	The formalisms of logic have been used to differing
extents in AI.  Most of the uses are much less ambitious than
the proposals of (McCarthy 1959).  We can distinguish four
levels of use of logic.

	1. A machine may use no logical sentences---all its
``beliefs'' being implicit in its state.  Nevertheless, it is often
appropriate to ascribe beliefs and goals to the program, i.e. to
remove the above sanitary quotes, and to use a principle of
rationality---{\it It does what it thinks will achieve its goals}.
Such ascription is discussed from somewhat different points of view
 in (Dennett 1971), (McCarthy 1979a) and
(Newell 1981).  The advantage is that the intent of the machine's
designers and the way it can be expected to behave may be more readily
described {\it intentionally} than by a purely physical description.

	The relation between the physical and the {\it intentional}
descriptions is most readily understood in simple systems that admit
readily understood descriptions of both kinds, e.g. thermostats.  Some
finicky philosophers object to this, contending that unless a system
has a full human mind, it shouldn't be regarded as having any mental
qualities at all.  This is like omitting the numbers 0 and 1 from the
number system on the grounds that numbers aren't required to count
sets with no elements or one element.
Indeed if your main interest is the null set or unit sets, numbers
{\it are} irrelevant.  However, if your interest is the number system
you lose clarity and uniformity
if you omit 0 and 1.  Likewise, when one studies phenomena like belief,
e.g. because one wants a machine with beliefs and which reasons about
beliefs, it works better not to exclude simple cases from the formalism.
One battle has been over whether it should be forbidden to ascribe to a simple
thermostat the belief that the room is too cold.
(McCarthy 1979a) says much more about ascribing mental qualities
to machines, but that's not where the main action is in AI.

	2. The next level of use of logic involves computer programs
that use sentences in machine memory to represent their beliefs but
use other rules than ordinary logical inference to reach conclusions.
New sentences are often obtained from the old ones by ad hoc programs.
Moreover, the sentences that appear in memory belong to a
program-dependent subset of the logical language being used.  Adding
certain true sentences in the language may even spoil the functioning
of the program.  The languages used are often rather unexpressive
compared to first order logic, for example they may not admit
quantified sentences, or they may use a
different notation from that used for ordinary facts to represent
``rules'', i.e.  certain universally quantified implication sentences.
Most often, conditional rules are used in just one
direction, i.e. contrapositive reasoning is not used.  
Usually the program cannot infer new rules; rules
must have all been put in by the ``knowledge engineer''.  Sometimes
programs have this form through mere ignorance, but the usual
reason for the restriction is the practical desire to make the program
run fast and deduce just the kinds of conclusions its designer
anticipates.
  We
believe the need for such specialized inference will turn out to be
temporary and will be reduced or eliminated by improved ways of
controlling general inference, e.g. by allowing the heuristic rules to
be also expressed as sentences as promised in the above extract from
the 1959 paper.
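
	As a small illustration of the one-directional use of rules, consider
the following hypothetical rule base, written in Prolog syntax only for
concreteness:
\smallskip
{\tt\obeyspaces\obeylines
/* a hypothetical rule base of the kind described above */
diagnosis(flu) :- symptom(fever), symptom(cough).
symptom(fever).
symptom(cough).}
\smallskip
Such a system concludes $diagnosis(flu)$ from the two symptom facts, but
being told that the diagnosis is wrong gives it no way to conclude that one
of the symptom reports must be mistaken; the contrapositive of the rule is
never used.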

	3. The third level uses first order logic and also logical
deduction.  Typically the sentences are represented as clauses, and the
deduction methods are based on J. Allen Robinson's (1965) method of
resolution.  It is common to use a theorem prover as a problem solver,
i.e.  to determine an $x$ such that $P(x)$ as a byproduct of a proof of
the formula $\exists xP(x)$.
This level is less used for practical
purposes than level two, because techniques for controlling the
reasoning are still insufficiently developed, and it is common for the
program to generate many useless conclusions before reaching the desired
solution.  Indeed, unsuccessful experience (Green 1969) with this method
led to more restricted uses of logic, e.g. the STRIPS system of (Fikes
and Nilsson 1971).
%The promise of (McCarthy 1959) to express the
%heuristic facts that should be used to guide the search as logical
%sentences has not yet been realized by anyone.

	The commercial ``expert system shells'', e.g. ART, KEE and
OPS-5, use logical representation of facts, usually ground facts only,
and separate facts from rules.  They provide elaborate but not always
adequate ways of controlling inference.

	In this connection it is important to mention logic programming,
first introduced in Microplanner (Sussman et al., 1971) 
and from different points of view by Robert Kowalski (1979) and Alain
Colmerauer in the early 1970s.
A recent text is (Sterling and Shapiro 1986).  Microplanner
was a rather unsystematic collection of tools, whereas Prolog relies
almost entirely on one kind of logic programming, but the main idea
is the same.  If one uses a restricted class of sentences, the so-called
Horn clauses, then it is possible to use a restricted form of logical
deduction.  The control problem is then much eased, and it is possible
for the programmer to anticipate the course the deduction will take.
The price paid is that only certain kinds of facts are conveniently
expressed as Horn clauses, and the depth first search built into
Prolog is not always appropriate for the problem.


	Even when the relevant facts can be expressed as Horn
clauses supplemented by negation as failure, the reasoning
carried out by a Prolog program may not be appropriate.  For
example, the fact that a sealed container is sterile if all the
bacteria in it are dead and the fact that heating a can kills a
bacterium in the can are both expressible as Prolog clauses.
However, the resulting program for sterilizing a container will
kill each bacterium individually, because it will have to index
over the bacteria.  It won't reason that heating the can kills
all the bacteria at once, because it doesn't do universal
generalization.

	Here's a Prolog program for testing whether a container
is sterile.  The predicate symbols have obvious meanings.
\smallskip
{\tt\obeyspaces\obeylines
not(P) :- P, !, fail.
not(P).

sterile(X) :- not(nonsterile(X)).
nonsterile(X) :- bacterium(Y), in(Y,X), not(dead(Y)).
hot(Y) :- in(Y,X), hot(X).
dead(Y) :- bacterium(Y), hot(Y).
bacterium(b1).
bacterium(b2).
bacterium(b3).
bacterium(b4).
in(b1,c1).
in(b2,c1).
in(b3,c2).
in(b4,c2).
hot(c1).}
\smallskip
Giving Prolog the goals $sterile(c1)$ and $sterile(c2)$ gives
the answers $yes$ and $no$ respectively.  However, Prolog has
indexed over the bacteria in the containers.

	The following is a Prolog program that can verify whether
a sequence of actions, actually just heating it, will sterilize
a container.  It involves introducing situations analogous to
those discussed in (McCarthy and Hayes 1969).

{\tt\obeyspaces\obeylines
not(P) :- P, !, fail.
not(P).

sterile(X,S) :- not(nonsterile(X,S)).
nonsterile(X,S) :- bacterium(Y), in(Y,X), not(dead(Y,S)).
hot(Y,S) :- in(Y,X), hot(X,S).
dead(Y,S) :- bacterium(Y), hot(Y,S).
bacterium(b1).
bacterium(b2).
bacterium(b3).
bacterium(b4).
in(b1,c1).
in(b2,c1).
in(b3,c2).
in(b4,c2).
hot(C,result(heat(C),S)).}

	When the program is given the goals $sterile(c1,result(heat(c1),s0))$
and $sterile(c2,\allowbreak result(heat(c1),s0))$ it answers $yes$ and $no$ respectively.
However, if it is given the goal $sterile(c1,s)$, it will fail because
Prolog lacks what logic programmers call ``constructive negation''.

	The same facts as are used in the first Prolog program can be expressed
in a first order language as follows.
%
$$(\forall X)(sterile(X) \equiv  (\forall Y)(bacterium(Y) \wedge  in(Y,X) \supset  dead(Y))),$$

$$(\forall X Y)(hot(X) \wedge  in(Y,X) \supset  hot(Y)),$$

$$(\forall Y)(bacterium(Y) \wedge  hot(Y) \supset  dead(Y)),$$
%
and
%
$$hot(a).$$
%
However, from them we can prove $sterile(a)$ without having
to index over the bacteria.
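
	To see how universal generalization replaces the indexing, consider an
arbitrary $Y$ satisfying $bacterium(Y) \wedge  in(Y,a)$.  From $hot(a)$,
$in(Y,a)$ and the second axiom we get $hot(Y)$, and the third axiom then
gives $dead(Y)$.  Since $Y$ was arbitrary, universal generalization yields
$$(\forall Y)(bacterium(Y) \wedge  in(Y,a) \supset  dead(Y)),$$
and the first axiom gives $sterile(a)$.  Nothing in the proof depends on
how many bacteria there are.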

	Expressibility in Horn clauses, whether supplemented by
negation as failure or not, is an important property of a set of
facts and logic programming has been successfully used for many
applications.  However, it seems unlikely to dominate AI
programming as some of its advocates hope.

	Although  third level systems express both facts and rules
as logical sentences, they are still rather specialized.  The axioms
with which the programs begin are not general truths about the world
but are sentences whose meaning and truth is limited to the narrow
domain in which the program has to act.  For this reason, the ``facts''
of one program usually cannot be used in a database for other programs.

	4. The fourth level is still a goal.  It involves representing
general facts about the world as logical sentences.  Once put in
a database, the facts can be used by any program.  The facts would
have the neutrality of purpose characteristic of much human information.
The supplier of information would not have to understand
the goals of the potential user or how his mind works.  The present
ways of ``teaching'' computer programs by modifying them or
directly modifying their databases amount to ``education
by brain surgery''.

	A key problem for achieving the fourth level is to develop
a language for a general common sense database.  This is difficult,
because the {\it common sense informatic situation} is complex.
Here is a preliminary list of features and
considerations.

	1. Entities of interest are known only partially, and the
information about entities and their relations that may be relevant
to achieving goals cannot be permanently separated from irrelevant
information.  
%
(Contrast this with the situation in gravitational
astronomy in which it is stated in the informal introduction to
a lecture or textbook that
the chemical composition and shape of a body are irrelevant to the
theory; all that counts is the body's mass, and its initial position
and velocity).

	Even within gravitational astronomy, non-equational theories arise
and relevant information may be difficult to determine.  For example, it was
recently proposed that periodic extinctions discovered in the
paleontological record are caused by showers of comets induced by a
companion star to the sun that encounters and disrupts the Oort cloud of
comets every time it comes to perihelion.  This theory is qualitative
because neither the orbit of the hypothetical star nor those of the comets
is available.

	2. The formalism has to be {\it epistemologically adequate},
a notion introduced in (McCarthy and Hayes 1969).  This means that
the formalism must be capable of representing the information that
is actually available, not merely capable of representing actual
complete states of affairs.

	For example, it is insufficient to have a formalism that
can represent the positions and velocities of the particles in a
gas.  We can't obtain that information, our largest computers don't
have the memory to store it even if it were available, and our
fastest computers couldn't use the information to make predictions even
if we could store it.

	As a second example, suppose we need to be able to predict
someone's behavior.  The simplest example is a clerk in a store.
The clerk is a complex individual about whom a customer may know
little.  However, the clerk can usually be counted on to accept
money for articles brought to the counter, wrap them as appropriate
and not protest when the customer then takes the articles from the store.
The clerk can also be counted on to object if the customer attempts
to take the articles without paying the appropriate price.  Describing
this requires a formalism capable of representing information about
human social institutions.  Moreover, the formalism must be capable
of representing partial information about the institution, such as
a three year old's knowledge of store clerks.  For example, a three
year old doesn't know the clerk is an employee or even what that
means.  He doesn't require detailed information about the clerk's
psychology, and anyway this information is not ordinarily available.

	The following sections deal mainly with the advances we see
as required to achieve the fourth level of use of logic in AI.

\section{Formalized Nonmonotonic Reasoning}

	It seems that fourth level systems require extensions
to mathematical logic.  One kind of extension is formalized {\it nonmonotonic
reasoning}, first proposed in the late 1970s (McCarthy 1977, 1980, 1986),
(Reiter 1980), (McDermott and Doyle 1980), (Lifschitz 1989a).
Mathematical logic has been monotonic
in the following sense.  If we have $A \vdash p$ and $A \subset  B$, then we also
have $B \vdash p$.

	If the inference is logical deduction, then exactly the same
proof that proves $p$ from $A$ will serve as a proof from $B$. If the
inference is model-theoretic, i.e.  $p$ is true in all models of $A$,
then $p$ will be true in all models of $B$, because the models of $B$
will be a subset of the models of $A$.  So we see that the monotonic
character of traditional logic doesn't depend on the details of the
logical system but is quite fundamental.

	While much human reasoning is monotonic,
some important human common sense reasoning is not.  We
reach conclusions from certain premisses that we would not reach if
certain other sentences were included in our premisses.  For example,
if I hire you to build me a bird cage, you conclude that it is appropriate
to put a top on it, but when you learn the further
fact that my bird is a penguin  you no longer draw that
conclusion.  Some people think it is possible to try to save
monotonicity by saying that what was in your mind was not a general rule
about birds flying but a probabilistic rule.  So
far these people have not worked out any detailed
epistemology for this approach, i.e.  exactly what probabilistic
sentences should be used.  Instead AI has moved to directly formalizing
nonmonotonic logical reasoning.  Indeed it seems to me that
when probabilistic reasoning (and not just the axiomatic
basis of probability theory) has been fully formalized, it will
be formally nonmonotonic.

	Nonmonotonic reasoning is an active field of study.
Progress is often driven by examples, e.g. the Yale shooting
problem (Hanks and McDermott 1986), in which obvious
axiomatizations used with the available reasoning formalisms
don't seem to give the answers intuition suggests.  One direction
being explored (Moore 1985, Gelfond 1987, Lifschitz 1989a)
involves putting facts about belief and knowledge explicitly in
the axioms---even when the axioms concern nonmental domains.
Moore's classical example (now 4 years old) is ``If I had an elder
brother I'd know it.''

	Kraus and Perlis (1988) have proposed to divide much nonmonotonic
reasoning into two steps.  The first step uses Perlis's (1988)
autocircumscription to get a second order formula characterizing
what is possible.  The second step involves default reasoning to
choose what is normally to be expected out of the previously established
possibilities.  This seems to be a promising approach.

(Ginsberg 1987) collects the main papers up to 1986.  Lifschitz (1989c)
summarizes some example research problems of nonmonotonic reasoning.
\section{Some Formalizations and their Problems}

	(McCarthy 1986) discusses several formalizations, proposing
those based on nonmonotonic reasoning as improvements of earlier
ones.  Here are some.

	1. Inheritance with exceptions.  Birds normally fly, but there
are exceptions, e.g. ostriches and birds whose feet are encased in
concrete.  The first exception might be listed in advance, but the
second has to be derived or verified when mentioned on the basis of
information about the mechanism of flying and the properties of
concrete.

	There are many ways of nonmonotonically axiomatizing the
facts about which birds can fly.  The following axioms using
a predicate $ab$ standing for ``abnormal'' seem
to me quite straightforward.
%\leql{a4a:}
$$(\forall x)(\neg ab(aspect1(x)) \supset  \neg flies(x)).\leql{aiva}$$
%
Unless an object is abnormal in $aspect1$, it can't fly.

	It wouldn't work to write $ab(x)$ instead of $ab(aspect1(x))$,
because we don't want a bird that is abnormal with respect to its ability
to fly to be automatically abnormal in other respects.  Using aspects limits
the effects of proofs of abnormality.
%leql{a5:}
$$(\forall x)(bird(x) \supset  ab(aspect1(x))).\leql{av}$$
%leql{a6:}
$$(\forall x)(bird(x) \wedge  \neg ab(aspect2(x)) \supset  flies(x))\leql{avi}$$
%
Unless a bird is abnormal in $aspect2$, it can fly.

	When these axioms are combined with other facts about the
problem, the predicate $ab$ is then to be {\it circumscribed}, i.e.
given its minimal extent compatible with the facts being taken
into account.  This has the effect that a bird will be considered
to fly unless other axioms imply that it is abnormal in
$aspect2$. (\eqref{av}) is called a cancellation of inheritance
axiom, because it explicitly cancels the general presumption that
objects don't fly.  This approach works fine when the inheritance
hierarchy is given explicitly.  More elaborate approaches, some
of which are introduced in (McCarthy 1986) and improved in (Haugh
1988), are required when hierarchies with indefinite numbers of
sorts are considered.
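
	As a minimal illustration (the constants $Tweety$ and $Joe$ and the
ostrich axiom below are ours, added only for this example), suppose the
facts taken into account are $bird(Tweety)$, $bird(Joe)$, $ostrich(Joe)$ and
%
$$(\forall x)(ostrich(x) \supset  ab(aspect2(x))).$$
%
Circumscribing $ab$ with $flies$ allowed to vary makes
$ab(aspect2(Tweety))$ false, so (\eqref{avi}) gives $flies(Tweety)$.
For $Joe$, $ab(aspect2(Joe))$ is forced by the ostrich axiom, so neither
$flies(Joe)$ nor $\neg flies(Joe)$ follows unless a further axiom, e.g.
one saying that ostriches not abnormal in yet another aspect don't fly,
is added.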

	2. (McCarthy 1986) contains a similar treatment of the
effects of actions like moving and painting blocks using the
situation calculus.  Moving and painting are axiomatized entirely
separately, and there are no axioms saying that moving a block
doesn't affect the positions of other blocks or the colors of
blocks.  A general ``common sense law of inertia''
%
$$(\forall  p e s)(holds(p,s) \wedge  \neg ab(aspect1(p,e,s)) \supset  holds(p,result(e,s))),$$
%
asserts that a fact $p$ that holds in a situation $s$ is presumed
to hold in the situation $result(e,s)$ that results from an event
$e$ unless there is evidence to the contrary.  Unfortunately, Lifschitz
(1985 personal communication) and Hanks and McDermott (1986)
showed that simple treatments of the common sense law of inertia
admit unintended models.  Several
 authors have given more elaborate
treatments, but in my opinion, the results are not yet entirely
satisfactory.  The best treatment so far seems to be that of
(Lifschitz 1987).
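
	As an illustration of the intended use (the block and action names
are ours): from $holds(on(A,Table),S0)$, circumscribing $ab$ is intended to
make $ab(aspect1(on(A,Table),paint(B,Red),S0))$ false, so that the law of
inertia yields
%
$$holds(on(A,Table),result(paint(B,Red),S0)).$$
%
The unintended models arise because the global minimization of $ab$ can be
achieved in several incomparable ways; Appendix C discusses this difficulty
in detail.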
\section{Ability, Practical Reason and Free Will}

	An AI system capable of achieving goals in the common
sense world will have to reason about what it and other actors
 can and cannot do.
For concreteness, consider a robot that must act in the same
world as people and perform tasks that people give it.  Its need
to reason about its abilities puts the traditional philosophical
problem of free will in the following form.  What view shall we
build into the robot about its own abilities, i.e. how shall we
make it reason about what it can and cannot do?  (Wishing to
avoid begging any questions, by {\it reason} we mean {\it
compute} using axioms, observation sentences, rules of inference
and nonmonotonic rules of conjecture.)

	Let $A$ be a task we want the robot to perform, and let $B$
and $C$ be alternate intermediate goals either of which would
allow the accomplishment of $A$.  We want the robot to be able
to choose between attempting $B$ and attempting $C$.  It would be
silly to program it to reason: ``I'm a robot and a deterministic
device.  Therefore, I have no choice between $B$ and $C$.  What
I will do is determined by my construction.''  Instead it must
decide in some way which of $B$ and $C$ it can accomplish.  It
should be able to conclude in some cases that it can accomplish
$B$ and not $C$, and therefore it should take $B$ as a subgoal
on the way to achieving $A$.  In other cases it should conclude
that it {\it can} accomplish either $B$ or $C$ and should choose
whichever is evaluated as better according to the criteria we
provide it.

	(McCarthy and Hayes 1969) proposes conditions on the
semantics of any formalism within which the robot should reason.
The essential idea is that what the robot can do is determined by
the place the robot occupies in the world---not by its internal
structure.  For example, if a certain sequence of outputs from
the robot will achieve $B$, then we conclude or it concludes that
the robot can achieve $B$ without reasoning about whether the
robot will actually produce that sequence of outputs.

	Our contention is that this is approximately how any
system, whether human or robot, must reason about its ability to
achieve goals.  The basic formalism will be the same, regardless
of whether the system is reasoning about its own abilities
or about those of other systems including people.

	The above-mentioned paper also discusses the complexities
that come up when a strategy is required to achieve the goal and
when internal inhibitions or lack of knowledge have to be taken
into account.
\section{Three Approaches to Knowledge and Belief}

	Our robot will also have to reason about its own knowledge
and that of other robots and people.

	This section contrasts the approaches to knowledge and
belief characteristic of philosophy, philosophical logic and
artificial intelligence.  Knowledge and belief have long been
studied in epistemology, philosophy of mind and in philosophical
logic.  Since about 1960, knowledge and belief have also been
studied in AI.  (Halpern 1986) and (Vardi 1988) contain recent
work, mostly oriented to computer science including AI.

	It seems to me that philosophers have generally treated
knowledge and belief as {\it complete natural kinds}.  According
to this view there is a fact to be discovered about what
beliefs are.  Moreover, once it is decided what the objects of
belief are (e.g. sentences or propositions), the definitions of
belief ought to determine for each such object $p$ whether the
person believes it or not.  This last is the completeness mentioned
above.  Of course, mainly human and sometimes animal beliefs have
been considered.  Philosophers have differed about whether
machines can ever be said to have beliefs, but even those who admit
the possibility of machine belief consider that what beliefs are
is to be determined by examining human belief.

	The formalization of knowledge and belief has been studied
as part of philosophical logic, certainly since Hintikka's book (1964),
but much of the earlier work in modal logic can be seen as applicable.
Different logics and axioms systems sometimes correspond to the
distinctions that less formal philosophers make, but sometimes the
mathematics dictates different distinctions.

	AI takes a different course because of its different objectives,
but I'm inclined to recommend this course to philosophers also, partly
because we want their help but also because I think it has
philosophical advantages.

	The first question AI asks is: Why study knowledge and belief
at all?  Does a computer program solving problems and achieving goals
in the common sense world require beliefs, and must it use sentences
about beliefs?  The answer to both questions is approximately yes.  At
least there have to be data structures whose usage corresponds closely
to human usage in some cases.  For example, a robot that could use
the American air transportation system has to know that travel agents
know airline schedules, that there is a book (and now a computer
accessible database) called the OAG that contains this information.
If it is to be able to plan a trip with intermediate stops it has
to have the general information that the departure gate from an
intermediate stop is not to be discovered when the trip is first
planned but will be available on arrival at the intermediate stop.
If the robot has to keep secrets, it has to know about how information
can be obtained by inference from other information, i.e. it has
to have some kind of information model of the people from whom
it is to keep the secrets.

	However, none of this tells us that the notions of
knowledge and belief to be built into our computer programs must
correspond to the goals philosophers have been trying to
achieve.  For example, the difficulties involved in building a
system that knows what travel agents know about airline schedules
are not substantially connected with questions about how the
travel agents can be absolutely certain.  Its notion of knowledge
doesn't have to be complete; i.e.  it doesn't have to determine
in all cases whether a person is to be regarded as knowing a
given proposition.  For many tasks it doesn't have to have
opinions about when true belief doesn't constitute knowledge.
The designers of AI systems can try to evade philosophical
puzzles rather than solve them.

	Maybe some people would suppose that if the question of
certainty is avoided, the problems of formalizing knowledge and
belief become straightforward.  That has not been our experience.

	As soon as we try to formalize the simplest puzzles involving
knowledge, we encounter difficulties that philosophers have rarely
if ever attacked.

	Consider the following puzzle of Mr.~S and Mr.~P.

	{\it Two numbers $m$ and $n$ are chosen such that $2 \leq  m \leq  n \leq  99$.
Mr.~S is told their sum and Mr.~P is told their product.  The following
dialogue ensues:}

{\obeylines\it
Mr.~P:	I don't know the numbers.

Mr.~S:	I knew you didn't know them.  I don't know them either.

Mr.~P:	Now I know the numbers.

Mr.~S:	Now I know them too.

In view of the above dialogue, what are the numbers?}

	Formalizing the puzzle is discussed in (McCarthy 1989).
For the present we mention only the following aspects.

	1. We need to formalize {\it knowing what}, i.e. knowing what
the numbers are, and not just {\it knowing that}.

	2. We need to be able to express and prove non-knowledge as well as
knowledge.  Specifically we need to be able to express the fact that as
far as Mr.~P knows, the numbers might be any pair of factors of the known
product.

	3. We need to express the joint knowledge of Mr.~S and Mr.~P of
the conditions of the problem.

	4. We need to express the change of knowledge with time, e.g.
how Mr.~P's knowledge changes when he hears Mr.~S say that he knew that
Mr.~P didn't know the numbers and doesn't know them himself.
This includes inferring what Mr.~S and Mr.~P still won't know.

	The first order language used to express the facts of this
problem involves an accessibility relation $A(w1,w2,p,t)$,
modeled on Kripke's semantics for modal logic.  However, the
accessibility relation here is in the language itself rather than
in a metalanguage.  Here $w1$ and $w2$ are possible worlds, $p$
is a person and $t$ is an integer time.  The use of possible
worlds makes it convenient to express non-knowledge.  Assertions
of non-knowledge are expressed as the existence of accessible
worlds satisfying appropriate conditions.
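
	For illustration (the function $numbers$, giving the pair chosen in
a world, is ours and not the notation of (McCarthy 1989)), the assertion
that Mr.~P does not know the numbers in world $w1$ at time $t$ would take
roughly the form
%
$$(\exists w2)(A(w1,w2,P,t) \wedge  \neg (numbers(w2) = numbers(w1))),$$
%
where $P$ stands for Mr.~P, i.e. some world compatible with what Mr.~P
knows at time $t$ assigns a different pair of numbers.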

	The problem was successfully expressed in the language
in the sense that an arithmetic condition determining the values
of the two numbers can be deduced from the statement.  However, this
is not good enough for AI.  Namely, we would like to include facts
about knowledge in a general purpose common sense database.  Instead
of an {\it ad hoc} formalization of Mr.~S and Mr.~P, the problem
should be solvable from the same general facts about knowledge that
might be used to reason about the knowledge possessed by travel agents
supplemented only by the facts about the dialogue.  Moreover, the
language of the general purpose database should accommodate all
the modalities that might be wanted and not just knowledge.  This
suggests using ordinary logic, e.g. first order logic, rather than
modal logic, so that the modalities can be ordinary functions or
predicates rather than modal operators.

	Suppose we are successful in developing a ``knowledge formalism''
for our common sense database that enables the program controlling
a robot to solve puzzles and plan trips and do the other tasks that
arise in the common sense environment requiring reasoning about knowledge.
It will surely be asked whether it is really {\it knowledge} that
has been formalized.  I doubt that the question has an answer.
This is perhaps the question of whether knowledge is a natural kind.

	I suppose some philosophers would say that such problems are
not of philosophical interest.  It would be unfortunate, however, if
philosophers were to abandon such a substantial part of epistemology
to computer science.  This is because the analytic skills that
philosophers have acquired are relevant to the problems.


\section{Reifying Context}
%contex[w89,jmc]		Reifying context - for paper for Thomason

	We propose the formula $holds(p,c)$ to assert that the
proposition $p$ holds in context $c$.  It expresses explicitly
how the truth of an assertion depends on context.  The relation
$c1 \leq c2$ asserts that the context $c2$ is more general than
the context $c1$.

	Formalizing common sense reasoning needs contexts as objects,
in order to match human ability to consider context
explicitly.  The proposed database of general common sense knowledge
will make assertions in a general context called $C0$.  However, $C0$
cannot be maximally general, because it will surely involve unstated
presuppositions.  Indeed we claim that there can be no
maximally general context.  Every context involves unstated presuppositions,
both linguistic and factual.

	Sometimes the reasoning system will
have to transcend $C0$, and tools will have to be provided to do
this.  For example, if Boyle's law of the dependence of the volume
of a sample of gas on pressure were built into $C0$, discovery of
its dependence on temperature would have to trigger a process of 
generalization
that might lead to the perfect gas law.

	The following ideas about how the formalization might
proceed are tentative.  Moreover, they appeal to recent logical
innovations in the formalization of nonmonotonic reasoning. In
particular, there
will be nonmonotonic ``inheritance rules'' that allow default
inference from $holds(p,c)$ to $holds(p,c')$, where $c'$ is
either more general or less general than $c$.

	Almost all previous discussion of context has been in
connection with natural language, and the present paper
relies heavily on examples from natural language.  However, I
believe the main AI uses of formalized context will not be in
connection with communication but in connection with reasoning
about the effects of actions directed to achieving goals.  It's
just that natural language examples come to mind more readily.

	As an example of intended usage, consider
%
$$holds(at(he,inside(car)),c17).$$
%
Suppose that this sentence is intended to assert that a
particular person is in a particular car on a particular occasion,
i.e. the sentence is not just being used as a
linguistic example but is meant seriously.  A corresponding
English sentence is ``He's in the car'' where who he is and which
car and when is determined by the context in which the sentence
is uttered.  Suppose, for simplicity, that the sentence is said
by one person to another in a situation in which the car is
visible to the speaker but not to the hearer and the time at
which the subject is asserted to be in the car is the same
time at which the sentence is uttered.

	In our formal language $c17$ has to carry the information about
who he is, which car and when.

	Now suppose that the same fact is to be conveyed as in
the example above, but the context is a certain Stanford Computer Science
Department 1980s context.  Thus familiarity with cars is
presupposed, but no particular person, car or occasion is
presupposed.  The meanings of certain names are presupposed, however.
We can call that context (say) $c5$.  This more general context requires
a more explicit proposition; thus, we would have
%
$$\displaylines{holds(at(``Timothy McCarthy\mathchar`\",\cr
inside((\iota x)(iscar(x)\wedge 
belongs(x,``John McCarthy\mathchar`\")))),c5).\cr}$$
%
	A yet more general context might not identify a
specific John McCarthy, so that even this more explicit sentence would need
more information.  What would constitute an adequate identification
might also be context dependent.

	Here are some of the properties formalized contexts might have.

	1. In the above example, we will have $c17 \leq  c5$, i.e. $c5$ is
more general than $c17$.
There will be nonmonotonic rules like
%
$$(\forall  c1\ c2\ p)(c1 \leq  c2 \wedge  holds(p,c1) \wedge  \neg ab1(p,c1,c2) \supset  holds(p,c2))$$
%
and
%
$$(\forall  c1\ c2\ p)(c1 \leq  c2 \wedge  holds(p,c2) \wedge  \neg ab2(p,c1,c2) \supset  holds(p,c1)).$$
%
% VLADIMIR SUGGESTS THAT THESE ARE INSUFFIENTLY GENERAL
Thus there is nonmonotonic inheritance both up and down in the generality
hierarchy.

	2. There are functions forming new contexts by specialization.
We could have something like
%
$$c19 = specialize(\{he = Timothy\ McCarthy,\ belongs(car,John\ McCarthy)\},c5).$$
We will have $c19 \leq  c5$.
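
	To illustrate the intended interplay with the inheritance rules
(this is our gloss, not a worked out theory): since $c19 \leq  c5$, the
second rule above, with $ab2$ circumscribed, should let the explicit
sentence displayed earlier be inherited from $c5$ down into $c19$, where
the specializing assignments should in turn allow it to be abbreviated to
%
$$holds(at(he,inside(car)),c19).$$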

	3. Besides $holds(p,c)$, we may have $value(term,c)$, where
$term$ is a term.  The domain in which $term$ takes values is defined
in some outer context.

	4. Some presuppositions of a context are linguistic and some
are factual.  In the above example, it is a linguistic matter who the
names refer to.  The properties of people and cars are factual, e.g.
it is presumed that people fit into cars.

	5. We may want meanings as abstract objects.  Thus we might
have
%
$$meaning(he,c17) = meaning(``Timothy McCarthy\mathchar`\",c5).$$

	6. Contexts are ``rich'' entities not to be fully described.
Thus the ``normal English language context'' contains factual assumptions
and linguistic conventions that a particular English speaker may not
know.  Moreover, even assumptions and conventions in a context that
may be individually accessible cannot be exhaustively listed.  A person
or machine may know facts about a context without ``knowing the context''.

	7. Contexts should not be confused with the situations of the
situation calculus of (McCarthy and Hayes 1969).  Propositions about
situations can hold in a context.  For example, we may have
%
$$\displaylines{holds(Holds1(at(I,airport),\cr
result(drive-to(airport),result(walk-to(car),S0))),c1).\cr}$$
%
This can be interpreted as asserting that under the assumptions embodied
in context $c1$, a plan of walking to the car and then driving to the airport
would get the robot to the airport starting in situation $S0$.

	8. The context language can be made more like natural
language and more extensible if we introduce notions of entering
and leaving a context.  These will be analogous to the notions
of making and discharging assumptions in natural deduction systems,
but the notion seems to be more general.  Suppose we have $holds(p,c)$.
We then write

\noindent $enter\ c$.

\noindent This enables us to write $p$ instead of $holds(p,c)$.
If we subsequently infer $q$, we can replace it by $holds(q,c)$
and leave the context $c$.  Then $holds(q,c)$ will itself hold in
the outer context in which $holds(p,c)$ holds.  When a context
is entered, there need to be restrictions analogous to those
that apply in natural deduction when an assumption is made.
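
	As a schematic example (writing $implies(p,q)$ for the functional
analog of $\supset$ mentioned in item 10 below; the name is ours, chosen
only for illustration), suppose both $holds(p,c)$ and $holds(implies(p,q),c)$
are available in the outer context:
%
$$\displaylines{
holds(p,c),\quad holds(implies(p,q),c)\cr
enter\ c\cr
p,\quad implies(p,q)\cr
q\cr
leave\ c\cr
holds(q,c).\cr}$$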

	One way in which this notion of entering and leaving
contexts is more general than natural deduction is that formulas like
$holds(p,c1)$ and (say) $holds(not\ p,c2)$ behave differently
from $c1 \supset  p$ and $c2 \supset  \neg p$ which are their natural deduction
analogs.  For example, if $c1$ is associated with the time 5pm
and $c2$ is associated with the time 6pm and $p$ is $at(I, office)$,
then $holds(p,c1) \wedge  holds(not\ p,c2)$ might be used to infer that
I left the office between 5pm and 6pm.  $(c1 \supset  p) \wedge  (c2 \supset  \neg p)$
cannot be used in this way; in fact it is equivalent to
$\neg c1 \vee  \neg c2$.

	9. The expression $Holds(p,c)$ (note the caps) represents
the proposition that $p$ holds in $c$.  Since it is a proposition,
we can assert $holds(Holds(p,c),c')$.

	10. Propositions will be combined by functional analogs of 
the Boolean operators as discussed in (McCarthy 1979b).
Treating propositions involving quantification is
necessary, but it is difficult to determine the right formalization.

	11. The major goals of research into formalizing context
should be to determine the rules that relate contexts to their
generalizations and specializations.  Many of these rules will
involve nonmonotonic reasoning.
\section{Remarks}

	The project of formalizing common sense knowledge and
reasoning raises many new considerations in epistemology and
also in extending logic.  The role that the following ideas
might play is not clear yet.

\noindent Epistemological Adequacy often Requires Approximate Partial Theories

	(McCarthy and Hayes 1969) introduces the notion of epistemological
adequacy of a formalism.  The idea is that the formalism used by
an AI system must be adequate to represent the information that
a person or program with given opportunities to observe can actually
obtain.  Often an epistemologically adequate formalism for some
phenomenon cannot take the form of a classical scientific theory.
I suspect that some people's demand for a classical scientific
theory of certain phenomena leads them to despair about formalization.
Consider a theory of a dynamic phenomenon, i.e. one that changes
in time.  A classical scientific theory represents the state of
the phenomenon in some way and describes how it evolves with time, most
classically by differential equations.

	What can be known about commonsense phenomena usually doesn't
permit such complete theories.  Only certain states permit prediction
of the future.  The phenomenon arises in science and engineering
theories also, but I suspect that philosophy of science sweeps these
cases under the rug.  Here are some examples.

	(1) The theory of linear electrical circuits is complete
within its model of the phenomena.  The theory gives the response
of the circuit to any time varying voltage.  Of course, the theory
may not describe the actual physics, e.g. the current may overheat
the resistors.  However, the theory of sequential digital circuits
is incomplete from the beginning.  Consider a circuit built from
NAND-gates and D flipflops and timed synchronously by an appropriate
clock.  The behavior of a D flipflop is defined by the theory
when one of its inputs is 0 and the other is 1 when the inputs
are appropriately clocked.  However, the behavior is not defined
by the theory when both inputs are 0 or both are 1.  Moreover,
one can easily make circuits in such a way that both
inputs of some flipflop get 0 at some time.

	This lack of definition is not an oversight.  The actual
signals in a digital circuit are not ideal square waves but have
finite rise times and often overshoot their nominal values.
However, the circuit will behave as though the signals were
ideal provided the design rules are obeyed.  Making both
inputs to a flipflop nominally 0 creates a situation in
which no digital theory can describe what happens, because
the behavior then depends on the actual time-varying signals
and on manufacturing variations in the flipflops.

	(2) Thermodynamics is also a partial theory.  It tells
about equilibria and it tells which directions reactions go, but
it says nothing about how fast they go.

	(3) The commonsense database needs a theory of the
behavior of clerks in stores.  This theory should cover
what a clerk will do in response to bringing items to the
counter and in response to a certain class of inquiries.
How he will respond to other behaviors is not defined by
the theory.

	(4) (McCarthy 1979a) refers to a theory of skiing that
might be used by ski instructors.  This theory regards the skier
as a stick figure with movable joints.  It gives the consequences
of moving the joints as it interacts with the shape of the ski
slope, but it says nothing about what causes the joints to be
moved in a particular way.  Its partial character corresponds
to what experience teaches ski instructors.  It often assigns
truth values to counterfactual conditional assertions like, ``If
he had bent his knees more, he wouldn't have fallen''.

\noindent Meta-epistemology
% meta[s88,jmc]		Message to AILIST on metaepistemology
% meta[e85,jmc]		Meta-epistemology
% metaep[f82,jmc]		A proposal for meta-epistemology

	If we are to program a computer to think about its own
methods for gathering information about the world, then it needs
a language for expressing assertions about the relation between
the world, the information gathering methods available to an
information seeker and what it can learn.  This leads to a subject
I like to call meta-epistemology.  Besides its potential applications
to AI, I believe it has applications to philosophy considered in
the traditional sense.

	Meta-epistemology is proposed as a mathematical theory
in analogy to metamathematics.  Metamathematics considers the
mathematical properties of mathematical theories as objects.
In particular model theory as a branch of metamathematics deals
with the relation between theories in a language and interpretations
of the non-logical symbols of the language.  These interpretations
are considered as mathematical objects, and we are only sometimes
interested in a preferred or true interpretation.

	Meta-epistemology considers the relation between the world,
languages for making assertions about the world, notions of what
assertions are considered meaningful, what are accepted as rules
of evidence and what a knowledge seeker can discover about the
world.  All these entities are considered as mathematical objects.
In particular the world is considered as a parameter.
Thus meta-epistemology has the following characteristics.

	1. It is a purely mathematical theory.  Therefore, its
controversies, assuming there are any, will be mathematical
controversies rather than controversies about what the real world
is like.  Indeed metamathematics gave many philosophical issues
in the foundations of mathematics a technical content.  For
example, the theorem that intuitionist arithmetic and Peano
arithmetic are equi-consistent removed at least one area of
controversy between those whose mathematical intuitions support
one view of arithmetic or the other.

	2. While many modern philosophies of science assume some
relation between what is meaningful and what can be verified or
refuted, only special meta-epistemological systems will have the
corresponding mathematical property that all aspects of the world
relate to the experience of the knowledge seeker.

	This has several important consequences for the task of
programming a knowledge seeker.

	A knowledge seeker should not have a priori prejudices
(principles) about what concepts might be meaningful.  Whether
and how a proposed concept about the world might ever connect
with observation may remain in suspense for a very long time
while the concept is investigated and related to other concepts.

	We illustrate this by a literary example.  Moli\'ere's
play {\it Le Malade Imaginaire} includes a doctor who explains
sleeping powders by saying that they contain a ``dormitive
virtue''.  In the play, the doctor is considered a pompous fool
for offering a concept that explains nothing.  However, suppose
the doctor had some intuition that the dormitive virtue might be
extracted and concentrated, say by shaking the powder in a
mixture of ether and water.  Suppose he thought that he would get
the same concentrate from all substances with soporific effect.
He would certainly have a fragment of scientific theory subject
to later verification.  Now suppose less---namely, he only
believes that a common component is behind all substances whose
consumption makes one sleepy but has no idea that he should try
to invent a way of verifying the conjecture.  He still has
something that, if communicated to someone more scientifically
minded, might be useful.  In the play, the doctor obviously sins
intellectually by claiming a hypothesis as certain.  Thus a
knowledge seeker must be able to form new concepts that have only
extremely tenuous relations with their previous linguistic
structure.

\noindent Rich and poor entities

	Consider my next trip to Japan.  Considered as a plan it is
a discrete object with limited detail.  I do not yet even plan to
take a specific flight or to fly on a specific day.  Considered as
a future event, lots of questions may be asked about it.  For example,
it may be asked whether the flight will depart on time and what precisely
I will eat on the airplane.  We propose characterizing the actual trip
as a rich entity and the plan as a poor entity.  Originally, I thought
that rich events referred to the past and poor ones to the future, but
this seems to be wrong.  It's only that when one refers to the past
one is usually referring to a rich entity, while the future entities
one refers to are more often poor.  However, there is no intrinsic
association of this kind.  It seems that planning requires reasoning
about the plan (poor entity) and the event of its execution (rich
entity) and their relations.

	(McCarthy and Hayes 1969) defines situations as rich entities.
However, the actual programs that have been written to reason in
situation calculus might as well regard them as taken from a
finite or countable set of discrete states.

	Possible worlds are also examples of rich entities as
ordinarily used in philosophy.  One never prescribes a possible
world but only describes classes of possible worlds.

	Rich entities are open ended in that we can always introduce
more properties of them into our discussion.  Poor entities can often
be enumerated, e.g. we can often enumerate all the events that we
consider reasonably likely in a situation.  The passage from considering
rich entities in a given discussion to considering poor entities is
a step of nonmonotonic reasoning.

	It seems to me that it is important to get a good formalization
of the relations between corresponding rich and poor entities.
This can be regarded as formalizing the relation between the world
and a formal model of some aspect of the world, e.g. between the
world and a scientific theory.

\section{Acknowledgements}

	I am indebted to Vladimir Lifschitz and Richmond Thomason
for useful suggestions.  Some of the prose is taken from
(McCarthy 1987), but the examples are given more precisely in the
present paper, since {\it Daedalus} allows no formulas.

	The research reported here was partially supported by the
Defense Advanced Research Projects Agency, Contract No. N00039-84-C-0211.
\section{References}

\noindent
{\bf Dennett, D.C. (1971)}: ``Intentional Systems'', {\it Journal of Philosophy}
vol. 68, No. 4, Feb. 25.

\noindent
{\bf Dreyfus, Hubert L. (1972):} {\it What Computers Can't Do:
 the Limits of Artificial Intelligence}, revised edition 1979,
New York : Harper \& Row.

\noindent
{\bf Fikes, R. and Nils Nilsson (1971)}:
``STRIPS: A New Approach to the Application of 
Theorem Proving to Problem Solving,'' {\it Artificial Intelligence}, Volume 2,
Numbers 3,4, January,
pp. 189-208.

\noindent
{\bf Gelfond, M. (1987)}: ``On Stratified Autoepistemic Theories'',
 {\it AAAI-87} {\bf 1}, 207-211.

\noindent
{\bf Ginsberg, M. (ed.) (1987)}: {\it Readings in Nonmonotonic Reasoning},
Morgan Kaufmann, 481 p.

\noindent
{\bf Green, C. (1969)}:
``Application of Theorem Proving to Problem Solving'', in IJCAI-1, pp. 219-239.

\noindent
{\bf Halpern, J. (ed.) (1986):}
{\it Reasoning about Knowledge}, Morgan Kaufmann,
Los Altos, CA.

\noindent
{\bf Hanks, S. and D. McDermott (1986)}: ``Default Reasoning, Nonmonotonic
Logics, and the Frame Problem'', in AAAI-86, pp. 328-333.

\noindent
{\bf Haugh, Brian A. (1988)}: ``Tractable Theories of Multiple Defeasible
Inheritance in Ordinary Nonmonotonic Logics'' in {\it Proceedings of the Seventh
National Conference on Artificial Intelligence (AAAI-88)}, Morgan Kaufmann.

\noindent
{\bf Hintikka, Jaakko (1964)}: {\it Knowledge and Belief; an Introduction
 to the Logic of the Two Notions}, Cornell Univ. Press, 179 p.

\noindent
{\bf Kowalski, Robert (1979)}: {\it Logic for Problem Solving},
North-Holland, Amsterdam.

\noindent
{\bf Kraus, Sarit and Donald Perlis (1988)}: ``Names and Non-Monotonicity'',
UMIACS-TR-88-84, CS-TR-2140, Computer Science Technical Report Series,
University of Maryland, College Park, Maryland 20742.

\noindent
{\bf Lifschitz, Vladimir (1987)}: ``Formal Theories of Action'',
in: {\it The Frame Problem in Artificial Intelligence,
Proceedings of the 1987 Workshop}, reprinted in (Ginsberg 1987).

\noindent
{\bf Lifschitz, Vladimir (1989a)}: {\it Between Circumscription and
Autoepistemic Logic}, to appear in the Proceedings of the First
International Conference on Principles of Knowledge Representation
and Reasoning, Morgan Kaufmann.

\noindent
{\bf Lifschitz, Vladimir (1989b)}: {\it Circumscriptive Theories: A
Logic-based  Framework for Knowledge Representation}, this collection.

\noindent
{\bf Lifschitz, Vladimir (1989c)}: {\it Benchmark Problems for Formal
Nonmonotonic Reasoning}, in {\it Non-Monotonic Reasoning}, 2nd International
Workshop, Grassau, FRG, Springer-Verlag.

\noindent
{\bf McCarthy, John (1959)}: ``Programs with Common Sense'', in {\it
Proceedings of the Teddington Conference on the Mechanization of
Thought Processes}, Her Majesty's Stationery Office, London.
%  common[e80,jmc],
% common.tex[e80,jmc]

\noindent
{\bf McCarthy, John and P.J. Hayes (1969)}:  ``Some Philosophical Problems from
the Standpoint of Artificial Intelligence'', in D. Michie (ed), {\it Machine
Intelligence 4}, American Elsevier, New York, NY.
%  phil.tex[ess,jmc] with slight modifications

\noindent
{\bf McCarthy, John (1977)}:
``On The Model Theory of Knowledge'' (with M. Sato, S. Igarashi, and
T. Hayashi), {\it Proceedings of the Fifth International Joint Conference
on Artificial Intelligence}, M.I.T., Cambridge, Mass.

\noindent
{\bf McCarthy, John (1977)}:
``Epistemological Problems of Artificial Intelligence'', {\it Proceedings
of the Fifth International Joint Conference on Artificial 
Intelligence}, M.I.T., Cambridge, Mass.
%  ijcai.c[e77,jmc]

\noindent
{\bf McCarthy, John (1979a)}:
``Ascribing Mental Qualities to Machines'' in {\it Philosophical Perspectives 
in Artificial Intelligence}, Ringle, Martin (ed.), Harvester Press, July 1979.
%  .<<aim 326, MENTAL[F76,JMC],
% mental.tex[f76,jmc]>>

\noindent
{\bf McCarthy, John (1979b)}: 
``First Order Theories of Individual Concepts and Propositions'', 
in Michie, Donald (ed.) {\it Machine Intelligence 9}, (University of
Edinburgh Press, Edinburgh).
%  .<<aim 325, concep.tex[e76,jmc]>>

\noindent
{\bf McCarthy, John (1980)}: 
``Circumscription - A Form of Non-Monotonic Reasoning'', {\it Artificial
Intelligence}, Volume 13, Numbers 1,2, April.
%  .<<aim 334, circum.new[s79,jmc], cirnew.tex[s79,jmc]>>

\noindent
{\bf McCarthy, John (1983)}: ``Some Expert Systems Need Common Sense'',
in {\it Computer Culture: The Scientific, Intellectual and Social Impact
of the Computer}, Heinz Pagels, ed.
 vol. 426, Annals of the New York Academy of Sciences.
%paper
%presented at New York Academy of Sciences Symposium.
%  common[e83,jmc]
%common.tex[e83,jmc]

\noindent
{\bf McCarthy, John (1986)}:
``Applications of Circumscription to Formalizing Common Sense Knowledge''
{\it Artificial Intelligence}, April 1986.
%  circum.tex[f83,jmc]

\noindent
{\bf McCarthy, John (1987)}:
``Mathematical Logic in Artificial Intelligence'', in
 {\it Daedalus}, vol. 117, No. 1, American Academy of Arts and Sciences,
Winter 1988.
% logic.2[w87,jmc]

\noindent
{\bf McCarthy, John (1989)}: ``Two Puzzles Involving Knowledge'', in
{\it Formalizing Common Sense}, Ablex, 1989.

\noindent
{\bf McDermott, D. and J. Doyle (1980)}:
``Non-Monotonic Logic I'', {\it Artificial Intelligence\/},
Vol. 13, No. 1.

\noindent
{\bf Moore, R. (1985)}: ``Semantical Considerations on Nonmonotonic Logic'',
{\it Artificial Intelligence} {\bf 25} (1), 75-94.

\noindent
{\bf Newell, Allen (1981)}: ``The Knowledge Level'', {\it AI Magazine\/},
Vol. 2, No. 2.

\noindent
{\bf Perlis, D. (1988)}: ``Autocircumscription'', {\it Artificial Intelligence},
{\bf 36}, pp. 223-236.

\noindent
{\bf Reiter, Raymond (1980)}: ``A Logic for Default Reasoning'', {\it Artificial
Intelligence}, Volume 13, Numbers 1,2, April.

\noindent
{\bf Robinson, J. Allen (1965)}: ``A Machine-oriented Logic Based
on the Resolution Principle'', {\it JACM}, 12(1), 23-41.

\noindent
{\bf Russell, Bertrand (1913)}: ``On the Notion of Cause'',
{\it Proceedings of the Aristotelian Society}, 13, pp. 1-26.

\noindent
{\bf Sterling, Leon and Ehud Shapiro (1986)}: {\it The Art of Prolog}, MIT Press.

\noindent
{\bf Sussman, Gerald J., Terry Winograd, and
Eugene Charniak (1971)}: ``Micro-planner Reference Manual'', Report AIM-203A,
Artificial Intelligence Laboratory, Massachusetts Institute of Technology,
Cambridge.

\noindent
{\bf Vardi, Moshe (ed.) (1988)}:
{\it Conference on Theoretical Aspects of Reasoning about Knowledge},
Morgan Kaufmann, Los Altos, CA.
\vfill\eject
\centerline{\bf Appendix B. Overcoming Unexpected Obstacles}
\bigskip

	As described in Appendix A, there is a major requirement
for successful behavior in the common sense world that isn't met
by any of the formalisms that have been developed in operations
research and other disciplines for optimizing behavior.
Operations Research formalisms are developed by having the
researcher initially delimit the phenomena that are to be taken
into account.  If a fact not taken into account becomes relevant,
it is necessary for a person to revise the formalism.  If
computers are to behave intelligently in the common sense world,
they must be able to take into account facts not previously
regarded as relevant.

	The following set of situation calculus  axioms
gives a simple example of a situation in which a goal can
be achieved by a certain strategy if no unexpected events occur.
Some additional facts are given that allow for an unexpected
event to occur presenting an obstacle.  Still other facts allow
this obstacle to be overcome.  No revision of the facts is
involved, only taking new facts into account.  Therefore, no
human action need be taken to recognize and overcome the
unexpected obstacle.

  The original reasoning that
the goal can be achieved is defeasible, because it is
partly nonmonotonic.  So is the final conclusion that the goal
can be achieved by overcoming the obstacle, because there is
no proof that there can't be additional obstacles.  Of course,
that is what common sense situations are like.

	The situation calculus is introduced in (McCarthy and
Hayes, 1969), and the nonmonotonic formalism used in (McCarthy 1986).
The particular way of handling actions nonmonotonically is adapted
from (Lifschitz 1987), reprinted in (Ginsberg 1987).
The Lifschitz paper included as Appendix C is the best starting point for
understanding the formulas that follow.
All references are included in Appendix A.

	We use $holds(p,s)$ to assert that the proposition $p$
holds in situation $s$.  $precond(p,a)$ asserts that $p$ holding
is a precondition for the successful performance of action $a$.
Also we use $r$ for the function $result$ of previous papers in
order to shorten the formulas.
We take from Lifschitz the definition
%
$$(∀a s)(succeeds(a,s) ≡ (∀p)(precond(p,a) ⊃ holds(p,s)))$$
%
and the axioms
%
$$(∀a p s)(succeeds(a,s) ∧ causes(a,p) ⊃ holds(p,r(a,s))),$$
%
and
%
$$(∀a p s)(¬noninertial(p,a) ∧ holds(p,s) ⊃ holds(p,r(a,s))).$$
%
In order to deal with events that are not actions we add to the
Lifschitz formalism the predicate $occurs(e,s)$ meaning that the
event $e$ occurs in the situation $s$.  We extend the
function $r$ to apply to events as well as other actions, so
that $r(e,s)$ denotes the situation that results when event $e$
occurs in situation $s$.  We imagine that a sequence of events
occurs after an action, where this sequence may be empty.  We
have the axioms
%
$$(∀s)((∀e¬occurs(e,s)) ⊃ outcome(s) = s),$$
%
which asserts that when no events occur the situation remains
unchanged.  We also have
%
$$(∀e s)(occurs(e,s) ⊃ outcome(s) = outcome(r(e,s))),$$
%
and we make the definition
%
$$(∀a s)(rr(a,s) = outcome(r(a,s))).$$

	These facts are about the achievement of goals in general.
Associated with them is a {\it circumscription policy} telling
how the nonmonotonic reasoning is to be done.  In this case,
it is to circumscribe the predicates $precond$, $causes$, $noninertial$
and $occurs$.  Thus actions have only such preconditions as will
follow from the facts given and only those events occur that
are specified, etc.
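
	One immediate consequence is worth noting (our gloss, not one of the
axioms): as long as no $occurs$ assertions are present, circumscribing
$occurs$ makes it everywhere false, so the first $outcome$ axiom gives
$outcome(s) = s$ for every $s$ and hence
%
$$(∀a s)(rr(a,s) = r(a,s)).$$
%
Thus $rr$ behaves just like $result$ until some event, such as the lost
ticket below, is asserted to occur.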

\bigskip
\noindent An Example

We now come to the particular task---flying from Glasgow to
Moscow via London.  We assume that for flying the actor must
have a ticket and the flights must exist.  The exceptional
possibility is that the ticket is lost in London, and the
solution is to buy another ticket.  That's all there is, but
it carries the essence of the problem of dealing with
unexpected obstacles.  We have the axioms:

$$causes(fly(l1,l2),at(l2))$$
%
$$precond(at(l1),fly(l1,l2))$$
%
$$precond(hasticket,fly(l1,l2))$$
%
$$causes(buyticket,hasticket)$$
%
$$precond(flight(l1,l2),fly(l1,l2))$$
%
$$causes(loseticket, not\,hasticket)$$
%
$$holds(not\, p,s) ≡ ¬holds(p,s)$$
%
$$holds(flight(Glasgow,London),S0)$$
%
$$holds(flight(London,Moscow),S0)$$
%
$$holds(hasticket,S0)$$

We can show
%
$$
\eqalign{
holds(at&(Moscow),\cr
	&rr(fly(London,Moscow),rr(fly(Glasgow,London),S0)))\cr
}$$
%
with the above axioms.  However, if we add to these axioms
%
$$occurs(loseticket,r(fly(Glasgow,London),S0)),$$
%
then we can no longer show that the original strategy works.  But we
can show
%
$$
\eqalign{
holds(at&(Moscow),\cr
	&rr(fly(London,Moscow),
rr(buyticket,rr(fly(Glasgow,London),S0)))).\cr
}$$
%

	We are presuming that the facts about the possibility of buying
a new ticket are in the database of common sense facts.  However, these
facts are not taken into account in the original reasoning.  They need
only be invoked when relevant.  This corresponds to the way people
deal with obstacles and the way computers will have to do so.
Exactly the same treatment will work when an assumption is introduced that
the traveller breaks his leg.  The general facts about getting medical
treatment and acquiring a wheel chair are presumably in the database.
They need only be triggered by the statement that the traveller breaks
his leg in London.

	Actually, providing for breaking a leg gives rise to a more complex
epistemological problem.  Namely, travellers don't really know about
how medical treatment and wheel chairs are obtained.  They only know
that the relevant information can be obtained by appealing to the
airline or airport officials.

\vfill\eject
\centerline{\bf Appendix C. Formal Theories of Action}
\bigskip

We apply circumscription to
formalizing reasoning about the effects of actions
in the framework of the situation calculus. The axiomatic description
of causal connections between actions and changes allows us to
solve the qualification problem and the frame problem using only simple
forms of circumscription. The method is applied to the Hanks---McDermott
shooting problem and to a blocks world in which blocks can be moved and painted.

\footnote\ {This research was partially supported by DARPA under Contract
N0039-82-C-0250.}
\medskip

\noindent{\bf 1. Introduction}
\medskip

We consider the problem of formalizing reasoning about action in the framework of
the {\sl situation calculus} of McCarthy and Hayes (1969).
A {\sl situation} is the complete
state of the universe at an instant of time. A {\sl fluent} is a function defined
on situations. For instance, in the blocks world, the location of a given block
$x$ is a fluent whose values are the possible locations of blocks. In the
language of the situation calculus, the value of this fluent at $s$
for a block $x$ is denoted by
$location(x,s)$. If $s$ is a situation and $a$ an {\sl action}
then $result(a,s)$ stands for the situation that results when the action $a$
is carried out starting in the situation $s$.

We are primarily interested in the form of reasoning
which Hanks and McDermott (1986) call ``temporal projection'':
given some description of the current situation, some description of the
effects of possible actions, and a sequence of actions to be performed,
predict the properties of the world in the resulting situation.

A number of difficulties arise in attempts to
describe the effects of actions in the situation calculus and in similar
formalisms. One of them, known as the {\sl qualification
problem}, is related to the fact that the successful
performance of an action may depend on a large number of qualifications. As a
consequence, the axiom describing the effect of such an action has to include a
long list of assumptions. It is desirable to treat each qualification as a
separate fact and describe it by a separate axiom.

Another difficulty is that, in addition to the
axioms describing how the properties of the current situation
change when an action
is performed, we need a large number of axioms listing the properties which are
{\sl not} affected by a given action. This is called the {\sl frame problem}.
For instance, when a robot is moving a block, appropriate ``frame axioms'' must
guarantee that all the properties of the situation which are not related to the
positions of the robot and the block (for instance, the positions of all other
objects) remain the same.

John McCarthy proposed approaching the qualification problem and the frame
problem using {\sl circumscription} (McCarthy 1980, 1986). We
assume that the reader has some familiarity with this concept;
for the definition, see (McCarthy 1986). With regard to the
qualification problem, the idea is, instead of listing all the qualifications in the
antecedent of one axiom, to condition the axiom on the fact
that (a certain aspect of) the
action is not ``abnormal'' in the
situation in which the action is performed.
For each qualifying condition,
a separate axiom postulates that the condition implies the
abnormality of this aspect of the action.
Circumscribing abnormality should ensure that
the action has its expected effect whenever we are not in any of those exceptional
situations which, according to the axioms, make the action abnormal.

For the frame problem, similarly, McCarthy proposes an axiom which guarantees
that the location of $x$ in the situation $result(a,s)$ is the same as its location
in the situation $s$ unless a certain aspect of $a$ is abnormal. Then
circumscribing abnormality should allow us to prove that $x$ does not
change its position
if the action consists of moving a block other than $x$, or in painting a block,
etc., so that all ``frame axioms'' should become provable.

Further analysis has shown, however, that the result of minimizing abnormality
relative to McCarthy's axiom set is not quite as strong as necessary for the
successful formalization of the blocks world. The difficulty can be illustrated by
the following example (McCarthy 1986). Consider the
situations
%
$$S1=result(move(A,top\;B),S0)$$
%
and
%
$$S2=result(move(B,top\;A),S1),$$
%
where $S0$ is a situation with exactly blocks $A$ and $B$ on the table.
We expect circumscription to guarantee that the first action, $move(A,top\;B)$,
is ``normal'', so that $A$ is on top of $B$ in situation $S1$. In view of
that, the second action is ``abnormal'' (so that the positions of the blocks in
situation $S2$ will be impossible to predict on the basis of McCarthy's
axioms). However, if the first move is ``abnormal'' (for some unspecified
reason), the positions of the blocks after the first action may remain unchanged,
so that the second move may succeed after all. This observation shows that
the intended model of McCarthy's axioms is not the only one in which the set of
abnormal objects
is minimal. In the intended model, the first action is normal, and the
second is not; in the other model, it is the other way around. Using
circumscription, we can only prove the common properties of all minimal models;
minimizing abnormality will only give a disjunction.

Another example of such an ``overweak disjunction'',
for an even simpler system of actions, is discussed in
(Hanks and McDermott 1986). 
There are three possible actions: $load$ (a gun), $wait$, and $shoot$ (an
individual, whose name, according to some accounts, is Fred).
Normally, waiting does not cause any changes in the world, but
shooting leads to Fred's death, provided, of course, that the gun is loaded.
Assume that all three actions are performed, in the given order.
An axiom set similar to McCarthy's axioms for the blocks world is shown to have
an unintended minimal model, in which the gun mysteriously gets unloaded on the
waiting stage, so that the shooting does not kill Fred. In the
intended model, $wait$ does not change the world, but $shoot$ does; in the
unintended model, it is the other way around. In either case, abnormality
is minimal. Here again, the result of circumscription is too weak.

It is easy to notice that, in both examples, there
is a conflict between minimizing abnormality at two instants of time.
In the intended model, minimization
at the earlier instant is preferred. But circumscription knows nothing about time;
it interprets minimality of a predicate as the minimality of its extension
relative to set inclusion. This, according to Hanks and McDermott, is the
reason why circumscription does not live up to our expectations in application
to temporal reasoning:
``$\dots$ the class of models we want our logic to select is not the ``minimal
models'' in the set-inclusion sense of circumscription, but the
``chronologically minimal'' (a term due to Yoav Shoham): those in which
normality assumptions are made in chronological order, from earliest to latest,
or, equivalently, those in which abnormality occurs as late as possible''
(Hanks and McDermott 1986).

The paper by Hanks and McDermott was presented at the AAAI conference in 1986, and
three more talks given at the same conference,
(Kautz 1986), (Lifschitz 1986) and (Shoham 1986), met their challenge
by formalizing, in somewhat different contexts, the idea of chronological
minimization. As could be
expected, all of them had to use forms of logical minimization more general than
circumscription in the sense of (McCarthy 1986).

In this paper we argue that formalizing chronological minimization is not the
easiest way to solve the temporal projection problem. We start with
another view on the difference between the intended and the unintended
models of the axioms for actions: in the intended model, {\sl all changes in the
values of fluents are caused by actions}, whereas in other models this is
not the case. To express this distinction formally, we add new primitive
predicates to the language.\footnote{$^*$}{An early attempt to describe causation
by axioms was made by Patrick Hayes (1971); see also (McDermott 1982).}
With this different choice of primitive concepts, the effects of actions can
be characterized by simple axioms in the language of the situation calculus
plus traditional circumscription. The axiomatic treatment of causality is the
main distinctive feature of our approach.\footnote{$↑{*}$}{A similar solution
was independently found by Brian Haugh (1987).}

The method is illustrated by formalizing two examples: the shooting problem
and the blocks world. The formalizations are reasonably complete,
in the sense that they allow us to predict the effect of the execution of
any sequence of actions. They are also computationally tractable, in the
sense that, in both cases,
the result of circumscription can be determined by simple
syntactic methods (such as predicate completion). Proofs of these facts will
be given in the full version of the paper (Lifschitz 1987).
\medbreak

\noindent{\bf 2. The Yale Shooting Problem}
\medskip

To formalize the shooting story from the Hanks--McDermott paper,
we use object variables of four sorts: for truth values ($v$), for actions ($a$),
for situations ($s$), and for truth-valued fluents ($f$). The object constants
are: truth values $false$ and $true$, actions $load$, $wait$ and $shoot$,
situation $S0$ and fluents $loaded$ and $alive$.
The binary situation-valued
function $result$ takes an action term and a situation term as arguments.
Finally, there are three predicate constants:
$holds(f,s)$ expresses that $f$ is true in the situation $s$;
$causes(a,f,v)$ expresses that $a$ causes the fluent $f$ to take on the value $v$;
$precond(f,a)$ expresses that $f$ is a precondition for the successful
execution of $a$.\footnote{$↑{**}$}{Notice that, syntactically, the variables
for actions are {\sl object} variables, not function variables, even though
each action $a$ has a function associated with it: the situational fluent
$s \mapsto result(a,s)$. Similarly, the variables for truth-valued fluents are
object variables, not predicate variables. The value of $f$ at $s$ is
represented by $holds(f,s)$.}

Our axioms for the shooting problem can be classified into three groups. The first
group
describes the initial situation:
%
$$holds(alive,S0),\eqno(Y1.1)$$
$$¬ holds(loaded,S0).\eqno(Y1.2)$$
%
The second group tells us how the fluents are affected by actions:
%
$$causes(load,loaded,true),\eqno(Y2.1)$$
$$causes(shoot,loaded,false),\eqno(Y2.2)$$
$$causes(shoot,alive,false).\eqno(Y2.3)$$
%
These axioms describe the effects of 
{\sl successfully performed} actions;
they do not say {\sl when} an action can be successful.
This information is supplied separately:
%
$$precond(loaded,shoot).\eqno(Y2.4)$$
%
The last group consists of two axioms of a more general nature. We use
the abbreviations:
%
$$success(a,s) ≡ ∀f(precond(f,a) ⊃ holds(f,s)),$$
$$affects(a,f,s) ≡ success(a,s) ∧ ∃v\;causes(a,f,v).$$
%
One axiom describes how the value of a fluent changes after an action affecting
this fluent is carried out:
%
$$\eqalign{
success&(a,s) ∧ causes(a,f,v)\cr
		& ⊃ (holds(f,result(a,s)) ≡ v=true).\cr
}\eqno(Y3.1)$$
%
(Recall that $v$ can take on two values here, $true$ and $false$; the
equivalence in the consequent of
$Y3.1$ reduces to $holds(f,result(a,s))$ in the first case and
to the negation of this formula in the second.)
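Spelled out, and assuming $false≠true$ (which holds in the term models
considered below), the two instances of $Y3.1$ are
$$success(a,s) ∧ causes(a,f,true) ⊃ holds(f,result(a,s))$$
and
$$success(a,s) ∧ causes(a,f,false) ⊃ ¬holds(f,result(a,s)).$$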
If the fluent is not affected then its value remains the same:
%
$$\eqalign{
¬aff&ects(a,f,s)\cr
	& ⊃ (holds(f,result(a,s)) ≡ holds(f,s)).\cr
}\eqno(Y3.2)$$

This axiom set obviously does not include a number of assumptions that may be
essential. The axioms do not
tell us, for instance, whether $false$ and $true$ are different from
each other and whether there are any truth values other than these two; we
do not know whether $load$, $shoot$ and $wait$ are three different actions, etc.
In this preliminary discussion, instead of making all these assumptions
explicit, we limit our attention to
the {\sl term models} of the axioms,
in which every element of the universe is
represented by a ground term of the language, and different terms represent
different objects. (Such models are also called {\sl Herbrand models} in the case of
universal theories.)
We will identify the universe of a term model with
the set of ground terms.

It remains to specify how circumscription is used. We circumscribe $causes$ and
$precond$, with the remaining predicate $holds$ allowed to vary, relative to the
conjunction $Y$ of the axioms $Y1.1$--$Y3.2$.
This will lead us to
the conclusion that $causes$ and $precond$ are true only when this is required by
the axioms of the second group, $Y2$.
The minimization of $causes$ will imply, in view of axiom
$Y3.2$, that the only changes taking place in the world are those corresponding to
axioms $Y2.1$--$Y2.3$. This solves the frame problem.
The minimization of $precond$,
in view of the definition of $success$, will imply that the only unsuccessful
actions are those in which precondition $Y2.4$ is violated.
This solves the qualification
problem.

The main technical difference between this formulation and other
attempts to apply circumscription to the situation calculus is that the
predicates which we circumscribe,
$causes$ and $precond$, are {\sl not fluents}: they
do not have situation arguments. The predicate $success$, which does have a
situation argument, is {\sl defined} in terms of the minimized primitive
$precond$. Since the minimized predicates have no situation arguments, the
conflict between minimizing at earlier and at later instants of time
simply cannot arise.

It is not difficult to prove that this circumscription of
$Y$ has a unique term model $M↓0$. In this sense, it completely determines
whether any given sentence is true or false. In particular, the formula
$$\eqalign{
holds(&alive,\cr
      &result(shoot,result(wait,result(load,S0))))\cr
}$$
is false in $M↓0$.
The predicates $causes$ and $precond$ can be
described in $M↓0$ by explicit definitions:
$$
\eqalign{
causes(a,f,v)≡&[(a=load∧f=loaded∧v=true)\cr
		    ∨&(a=shoot∧f=loaded∧v=false)\cr
		    ∨&(a=shoot∧f=alive∧v=false)],\cr
precond(f,a)≡&(f=loaded∧a=shoot).\cr
}$$

The definitions can be obtained by applying the
predicate completion procedure to axioms $Y2$. They can be also characterized
as the result of circumscribing the predicates $causes$ and $precond$ relative to
$Y2$. These predicates occur both in $Y2$ and in $Y3$, but we see that
axioms $Y3$ can be ignored when the result of circumscription is determined.
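
As an illustration (in Python; the encoding of situations as lists of actions
is ours), the interpretation of $holds$ in $M↓0$ can be computed by recursion
on the situation term, using the completed $causes$ and $precond$ given above:

\beginlsverbatim
CAUSES = {("load", "loaded"): True,      # causes(load, loaded, true)
          ("shoot", "loaded"): False,    # causes(shoot, loaded, false)
          ("shoot", "alive"): False}     # causes(shoot, alive, false)
PRECOND = {"shoot": ["loaded"]}          # precond(loaded, shoot)
INITIAL = {"alive": True, "loaded": False}   # axioms Y1.1, Y1.2

def holds(f, situation):
    """Value of the fluent f in M0; a situation is represented by
    the list of actions leading to it from S0."""
    if not situation:                    # the situation is S0
        return INITIAL[f]
    *rest, a = situation                 # situation = result(a, rest)
    success = all(holds(p, rest) for p in PRECOND.get(a, []))
    if success and (a, f) in CAUSES:     # axiom Y3.1
        return CAUSES[(a, f)]
    return holds(f, rest)                # axiom Y3.2

print(holds("alive", ["load", "wait", "shoot"]))   # False, as noted above
\endlsverbatim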


The interpretation of $holds$ in $M↓0$ can be viewed as the description of
the work of a deterministic
finite automaton $\cal A$ with 4 internal states, corresponding to all possible
combinations of values of the fluents $loaded$ and $alive$.
The initial state of $\cal A$ is $(false,true)$. Its
input symbols are $load$, $wait$, $shoot$.
Strings of input symbols are ``plans'', 
and the goal condition of a planning problem would define a set of final states.
For example, the goal $¬alive$ corresponds to selecting
$(false,false)$ and $(true,false)$ as the final states, and
the plan $load$, $wait$, $shoot$ is accepted by
${\cal A}$ with this set of final states.
Each state of ${\cal A}$ corresponds to a set of situations which are
indistinguishable in terms of the fluents $loaded$ and $alive$.
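A sketch of this automaton (again in Python, for illustration only):

\beginlsverbatim
def step(state, action):
    """States are pairs (loaded, alive); input symbols are actions."""
    loaded, alive = state
    if action == "load":
        return (True, alive)
    if action == "shoot" and loaded:     # the precondition of shoot is satisfied
        return (False, False)
    return state                         # wait, or shooting an unloaded gun

def accepts(plan, final_states, initial=(False, True)):
    state = initial
    for action in plan:
        state = step(state, action)
    return state in final_states

# The goal "not alive" selects the two states in which alive = False.
print(accepts(["load", "wait", "shoot"], {(False, False), (True, False)}))  # True
\endlsverbatim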
\medbreak

\noindent{\bf 3. An Alternative Formulation}
\medskip

The formalization of the Yale shooting problem proposed above has two defects.
First, we were only able to show that actions lead to the expected results in
{\sl term models} of the circumscription,
not in arbitrary models. Second, we exploited
some special features of the problem that are not present in more complex
problems of this kind. Now we will consider an improved, though somewhat less
economical, formalization of the shooting story. Here are the main differences
between the two versions.

1. Since it is essential in the first solution that, in a term model, different
ground terms have different values, we will need some {\sl uniqueness of names}
axioms, expressing this property.
The following notation will be useful: if $c↓1,\dots,c↓n$ are
constants of the same sort then $U[c↓1,\dots,c↓n]$ stands for the set of
axioms $c↓i≠c↓j$ $(1≤i<j≤n)$. For instance, $U[load,wait,shoot]$ is
$$load≠wait,\;load≠shoot,\;wait≠shoot.$$

2. It is also essential in the first solution that every situation in the term 
model is 
the result of executing a sequence of actions in $S0$.\footnote{$↑*$}{This
fact allows us to define the interpretation of $holds$ in $M↓0$
by recursion on the
second argument.} We see several ways to deal with this problem, all
of them, admittedly, somewhat unintuitive. In the new solution we choose
to restrict axioms $Y3$ to a subclass of
{\sl relevant} situations; we will postulate that $S0$ is relevant and that
the result of executing an action in a relevant situation is relevant also.

3. The only precondition in the Yale shooting problem (given by axiom $Y2.4$)
is extremely simple: $loaded$ is one of the ``coordinate fluents'' defining the
state of the corresponding automaton (see the end of Section 2). Even in the
blocks world, we need preconditions of a more complex nature, such as 
$clear\;l$ and $clear\;top\;b$
for the action $move(b,l)$, where
$clear$ is explicitly defined in terms of primitive fluents.
Accordingly, in the second solution we explicitly distinguish between
fluents in the general sense and primitive fluents. Only a
primitive fluent can occur as the second argument of $causes$,
but preconditions do not have to be primitive.

4. The value which $a$ causes $f$ to take on may depend on the situation in
which $a$ is executed. For instance, toggling a switch may cause its status to
change in two different ways, depending on its current
status.\footnote{$↑{**}$}{This problem
was pointed out by Michael Georgeff.} 
Accordingly, we will allow a truth-valued fluent (not necessarily
primitive), rather than a truth value, to be the last argument of $causes$; the
value of that fluent in the situation $s$ is the value that $f$ will take on
in the situation $result(a,s)$, provided $a$ is successful. The effect of
toggling a switch can be described then by an axiom such as
$$causes(toggle,\,on,\,not\;on)$$
($not$ is the function which transforms a fluent into its negation).
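
To illustrate the revised law of change (a Python sketch with names of our own
choosing; preconditions and the predicate $relevant$ are omitted for brevity),
the value given to the primitive fluent is the value that the third argument of
$causes$ had in the {\sl old} situation:

\beginlsverbatim
CAUSES = {("toggle", "on"): ("not", "on")}   # causes(toggle, on, not on)
INITIAL = {"on": False}

def holds(f, situation):
    # non-primitive fluents: the constants and negations of fluents
    if f == "true":
        return True
    if f == "false":
        return False
    if isinstance(f, tuple) and f[0] == "not":
        return not holds(f[1], situation)
    if not situation:                    # the situation is S0
        return INITIAL[f]
    *rest, a = situation                 # situation = result(a, rest)
    if (a, f) in CAUSES:                 # law of change LC
        return holds(CAUSES[(a, f)], rest)
    return holds(f, rest)                # law of inertia LI

print(holds("on", ["toggle"]))           # True
print(holds("on", ["toggle", "toggle"])) # False: the status changed back
\endlsverbatim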

In the new formalization of the Yale shooting
we have an additional sort of variables, primitive
fluents. On the other hand, we do not need variables for truth values
any more. Thus the language has variables for actions ($a$), for situations ($s$),
for truth-valued fluents ($f$), and for primitive truth-valued fluents ($p$).
This last sort is considered a subtype of fluents, so that a term
representing a primitive fluent can be used wherever the syntax requires
a fluent. (Accordingly, in a structure for the language the domain of
primitive fluents must be a subset of the domain of fluents).

The object constants are the same as in
the language of Section 2. As before, $load$, $wait$
and $shoot$ are action constants and $S0$ a
situation constant; $loaded$ and $alive$ are primitive fluents and
$false$ and $true$ are fluents (intuitively, the identically false fluent
and the identically true fluent). The only function constant, $result$, 
and the predicate constants $holds$ and $precond$ are used
as in the first formulation. The arguments of $causes$ must be an action, a
primitive fluent and a fluent. In addition, we need the unary predicate
$relevant$, whose argument must be a situation term.


The axioms of the first two groups, $Y1$ and $Y2$ (Section 2) are included
in the new formulation without
change. (It is easy to see that all of them are well-formed formulas of the
new language). The
definitions of $success$ and $affects$ remain the same, except that variables
of different sorts are required now in the definition of $affects$:
$$affects(a,p,s) ≡ success(a,s) ∧ ∃f\;causes(a,p,f).$$
The counterpart of $Y3.1$ is the following {\sl Law of Change:}
%
$$\eqalign{
rele&vant\;s ∧ success(a,s) ∧ causes(a,p,f)\cr
			   &⊃ (holds(p,result(a,s)) ≡ holds(f,s)).\cr
}\eqno(LC)$$
%
$Y3.2$ becomes the {\sl Commonsense Law of Inertia:}
$$\eqalign{
rele&vant\;s ∧ ¬affects(a,p,s)\cr
			   & ⊃ (holds(p,result(a,s)) ≡ holds(p,s)).\cr
}\eqno(LI)$$

In addition, we need a few axioms which have no counterparts in the first
version. Let $Act$ be the list of actions $load$, $wait$, $shoot$, and let $Fl$
be the list of fluents
$$loaded,\;alive,\;false,\;true.$$
We need the following uniqueness of names axioms:
%
$$U[Act],\eqno(U1)$$
$$U[Fl],\eqno(U2)$$
$$\eqalign{
result(a1,s1)=res&ult(a2,s2)\cr
                 &⊃a1=a2∧s1=s2.\cr
}\eqno(U3)$$
%

Axiom $U3$ represents the assumption that a situation includes a
complete description of its history, i.e., uniquely defines what action has
led to it and in what situation the action was performed.

We also need to assume that the non-primitive fluent constants
$$false, true$$
denote objects which do not belong to the subdomain of primitive fluents.
If this list is denoted by $NPF$ then this assumption can be expressed by
$$p≠{\bf t}\qquad({\bf t}εNPF).\eqno(U4)$$

The next group of axioms consists of the
{\sl fluent definitions} for the non-primitive fluents:
%
$$holds(true,s),\eqno(FD.true)$$
$$¬holds(false,s).\eqno(FD.false)$$
Finally, there are 3 axioms for the new predicate $relevant$:
$$relevant\;S0,\eqno(R1)$$
$$relevant\;s⊃relevant\;result(a,s),\eqno(R2)$$
$$result(a,s)=S0⊃¬relevant\;s.\eqno(R3)$$
Axioms $R1$, $R2$ imply
$$relevant\;{\bf s}$$
for any situation term {\bf s} without variables.
Axiom $R3$ guarantees, in the presence of $R1$ and $R2$, that a sequence of
actions cannot possibly lead from $S0$ back to $S0$ (i.e., $S0$ is not ``in
the future'' of $S0$).

By $Y'$ we denote the conjunction of the axioms $Y1$, $Y2$, $LC$, $LI$, $U$,
$FD$, $R$.
The predicates $causes$ and $precond$ are circumscribed now
relative to $Y'$, with the
remaining predicates $holds$ and $relevant$ allowed to vary. 
This circumscription
has many non-isomorphic models. But in all of them
the predicates $causes$ and $precond$ satisfy the following
conditions (essentially identical to their definitions
in the model $M↓0$ of $Y$):
$$\eqalign{
causes(a,p,f)≡&(a=load∧p=loaded∧f=true)\cr
		    ∨&(a=shoot∧p=loaded∧f=false)\cr
		    ∨&(a=shoot∧p=alive∧f=false),\cr
precond(f,a)≡&(f=loaded∧a=shoot).\cr
}$$
Moreover, every ground atom is either true in all models of this
circumscription, or false in all of them.
Thus our ``circumscriptive theory'' is sufficiently strong
to decide any sentence without variables.
\medbreak

\noindent{\bf 4. The Blocks World}
\medskip

Now we are ready to consider a slightly larger example: a blocks world in
which blocks can be moved and painted, as in (McCarthy 1986). Even though
the object domain is now entirely different, the
primitive concepts $result$, $holds$, $causes$ and $precond$ will be used
again, and some
axioms (notably, the law of change and the law of inertia) will be included
in the axiom set for the blocks world in exactly the same form. These axioms
represent the general properties of actions which are likely to be useful
for constructing axiomatic descriptions of many different object domains.

As before, we have variables for
actions ($a$), for situations ($s$), for truth-valued fluents ($f$),
and for primitive truth-valued fluents ($p$), the last sort being again a
subtype of fluents. In addition, there are 3 ``domain-specific'' sorts:
blocks ($b$), locations ($l$) and colors ($c$). Object constants:
$S0$, $true$, $false$, blocks $Block↓1$,
$\dots$, $Block↓N$ (where $N$ is a fixed positive integer), location $Table$,
and, finally, colors $Red$, $White$ and $Blue$. Here is the list of
function constants, along with the sorts of their arguments and values:

\medskip
\+&$top$	&$(b→l)$,\cr
\+&$at$		&$(b,l→p)$,\cr
\+&$color$	&$(b,c→p)$,\cr
\+&$clear$	&$(l→f)$,\cr
\+&$move$	&$(b,l→a)$,\cr
\+&$paint$	&$(b,c→a)$,\cr
\+&$result$	&$(a,s→s)$.\cr

\medskip\noindent
The predicate constants are the same as in Section 3: $holds$, $causes$,
$precond$ and $relevant$.

The first group of axioms contains some general facts about blocks, their locations
and colors:
$$b=Block↓1∨\dots∨b=Block↓N,\eqno(B0.1)$$
$$holds(at(b,l1),s)∧holds(at(b,l2),s)⊃l1=l2,\eqno(B0.2)$$
$$\eqalign{
holds(color(b,c1),s)∧holds(color(&b,c2),s)\cr
				     &⊃c1=c2.\cr
}\eqno(B0.3)$$
Initially, all blocks are white and on the table:
$$holds(at(b,Table),S0),\eqno(B1.1)$$
$$holds(color(b,White),S0).\eqno(B1.2)$$
The effects of actions:
$$causes(move(b,l),at(b,l),true),\eqno(B2.1)$$
$$l≠l1⊃causes(move(b,l),at(b,l1),false),\eqno(B2.2)$$
$$causes(paint(b,c),color(b,c),true),\eqno(B2.3)$$
$$c≠c1⊃causes(paint(b,c),color(b,c1),false),\eqno(B2.4)$$
$$l≠Table⊃precond(clear\;l,move(b,l)),\eqno(B2.5)$$
$$precond(clear\;top\;b,move(b,l)),\eqno(B2.6)$$
$$precond(false,move(b,top\;b)).\eqno(B2.7)$$
(It is impossible to move a block onto its own top).
Axioms $B2.2$ and $B2.4$ may seem redundant: we already know from axioms
$B0.2$ and $B0.3$ that a block cannot possibly be at two places or have two
colors simultaneously. But our method requires
that all changes of all primitive fluents caused by an action be described
explicitly. It does not solve the ``ramification problem'' in the sense of
(Finger 1986) and (Ginsberg and Smith 1986).\footnote{$↑*$}{I am indebted to
Matthew Ginsberg for this important observation.}
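
The following sketch (Python; the encoding is ours and is only meant to make
the axioms concrete) simulates axioms $B1$, $B2$ together with the laws of
change and inertia on explicit states; it is not a substitute for the
circumscriptive theory:

\beginlsverbatim
BLOCKS = ["Block1", "Block2"]

def initial(p):                          # axioms B1.1, B1.2
    kind, _b, x = p
    return (kind == "at" and x == "Table") or (kind == "color" and x == "White")

def causes(a, p):
    # axioms B2.1-B2.4; True/False, or None if a does not affect p
    if a[0] == "move" and p[0] == "at" and a[1] == p[1]:
        return a[2] == p[2]
    if a[0] == "paint" and p[0] == "color" and a[1] == p[1]:
        return a[2] == p[2]
    return None

def success(a, s):
    # axioms B2.5-B2.7; paint has no preconditions
    if a[0] == "move":
        b, l = a[1], a[2]
        if l == ("top", b):
            return False
        if l != "Table" and not holds(("clear", l), s):
            return False
        return holds(("clear", ("top", b)), s)
    return True

def holds(f, situation):
    if f[0] == "clear":                  # FD.clear
        return all(not holds(("at", b, f[1]), situation) for b in BLOCKS)
    if not situation:
        return initial(f)
    *rest, a = situation
    if success(a, rest):
        caused = causes(a, f)
        if caused is not None:           # law of change LC
            return caused
    return holds(f, rest)                # law of inertia LI

s = [("move", "Block1", ("top", "Block2")), ("paint", "Block2", "Red")]
print(holds(("at", "Block1", ("top", "Block2")), s))  # True
print(holds(("color", "Block2", "Red"), s))           # True
print(holds(("at", "Block1", "Table"), s))            # False, by B2.2
\endlsverbatim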

The laws of change and inertia $LC$ and $LI$ are formulated as in $Y'$ above.
To state the counterparts of the uniqueness of names axioms from $Y'$,
we define $Act$ and $Fl$ for the blocks world language as follows:

$$Act=(move, paint),$$
$$Fl=(at,color,true,false,clear).$$
%
Now the list $Act$ consists of the symbols for {\sl functions} returning actions,
not of object constants, as in $Y'$. $Fl$ contains both object constants and
function constants.
We will extend the definition of $U[\dots]$ to the case when some or all
of its arguments are functions. We can treat constants as 0-ary
functions and thus define the meaning of $U[f↓1,\dots,f↓n]$, where $f↓1$,
$\dots$, $f↓n$ are functions returning values of the same sort. By definition,
this stands for the following set of axioms:
%
$$f↓ix↓1\dots x↓k≠f↓jy↓1\dots y↓l$$
for all $i<j$, and
$$f↓ix↓1\dots x↓k=f↓iy↓1\dots y↓k⊃(x↓1=y↓1∧\dots∧x↓k=y↓k),$$
for all $f↓i$ of arity $k>0$, where $x↓1$, $\dots$, $y↓1$, $\dots$ are
variables of appropriate sorts. These axioms express that $f↓1$, $\dots$, $f↓n$ are
injections with disjoint ranges. If each $f↓i$ is an object constant then we
get the special case defined in Section 3 above. Example: Axiom $U3$ can be written
in this notation as $U[result]$.
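
A small generator of these axiom sets (a Python sketch; the concrete syntax of
the output formulas is of our own choosing) makes the definition explicit;
constants are passed as functions of arity 0:

\beginlsverbatim
def term(f, args):
    return f if not args else f + "(" + ", ".join(args) + ")"

def uniqueness_axioms(symbols):
    """symbols is a list of pairs (function name, arity)."""
    axioms = []
    for i, (f, k) in enumerate(symbols):
        xs = [f"x{j + 1}" for j in range(k)]
        ys = [f"y{j + 1}" for j in range(k)]
        for g, m in symbols[i + 1:]:     # disjoint ranges, for i < j
            zs = [f"y{j + 1}" for j in range(m)]
            axioms.append(term(f, xs) + " != " + term(g, zs))
        if k > 0:                        # injectivity, for arity k > 0
            eqs = " & ".join(x + " = " + y for x, y in zip(xs, ys))
            axioms.append(term(f, xs) + " = " + term(f, ys) + " -> " + eqs)
    return axioms

# U[Table, top]: Table is a constant, top a unary function
for axiom in uniqueness_axioms([("Table", 0), ("top", 1)]):
    print(axiom)
# Table != top(y1)
# top(x1) = top(y1) -> x1 = y1
\endlsverbatim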

$NPF$ is defined as the list of terms 
$$false,\;true,\;clear\;l.$$
Now axioms $U1$--$U4$ are expressed exactly as in Section 3. In addition, we
have 3 domain-specific uniqueness axioms:
$$\eqalignno{
&U[Block↓1,\dots,Block↓N],&(U.block)\cr
&U[Table,top],&(U.location)\cr
&U[Red,White,Blue].&(U.color)\cr}
$$
The fluent definitions include, in addition to $FD.true$ and $FD.false$,
the definition of $clear$:
%
$$holds(clear\;l,s)≡∀b¬holds(at(b,l),s).\eqno(FD.clear)$$
%
Finally, the axioms $R$ are formulated as in $Y'$.

By $B$ we denote the conjunction of all these axioms. As before,
we circumscribe $causes$ and $precond$, with $holds$ and $relevant$
allowed to vary. It turns out that in each model of this circumscription
the minimized predicates satisfy the formulas obtained by ``completing''
axioms $B2$: $causes(a,p,f)$ is equivalent to
$$\eqalign{
∃b,&l,l1,c,c1[(a=move(b,l)∧p=at(b,l)∧f=true)\cr
    ∨&(a=move(b,l)∧p=at(b,l1)∧l≠l1∧f=false)\cr
    ∨&(a=paint(b,c)∧p=color(b,c)∧f=true)\cr
    ∨&(a=paint(b,c)∧p=color(b,c1)∧c≠c1∧f=false)],\cr
}$$
and $precond(f,a)$ is equivalent to
$$\eqalign{
∃b,l[a=&move(b,l)∧((f=clear\;l∧l≠Table)\cr
          &∨f=clear\;top\;b∨(l=top\;b∧f=false))].\cr
}$$
Every ground atom is either true in all models of this circumscription, or false
in all of them, so that this ``circumscriptive theory'', like the theory from
Section 3, is complete on the level of atomic formulas.
\medbreak

\noindent{\bf Acknowledgements}
\medskip

This paper is an outgrowth of ideas developed by John
McCarthy in his research on commonsense and non-monotonic reasoning, and I am
grateful for the opportunity to discuss my work with him.
I have also benefitted from discussions with David Etherington,
Michael Gelfond, Michael Georgeff, Matthew Ginsberg, Robert Givan, Brian Haugh,
Patrick Hayes, Robert Kowalski, Drew McDermott, Nils Nilsson, Donald Perlis,
Raymond Reiter and Yoav Shoham.
\medbreak

\noindent{\bf References}
\medskip

\noindent
\item{1.}
Finger, J., {\sl Exploiting Constraints in Design Synthesis}, PhD thesis,
Stanford University, Stanford, CA, 1986.

\noindent
\item{2.}
Ginsberg, M., and Smith, D., Reasoning about action I: a possible world approach,
{\sl Proc. Workshop on the Frame Problem in Artificial Intelligence}, Morgan
Kaufmann, 1987 (to appear).

\noindent
\item{3.}
Hanks, S. and McDermott, D., Default reasoning, nonmonotonic logics, and the
frame problem,
{\sl Proc. AAAI-86} {\bf 1}, 1986, 328-333.

\noindent
\item{4.}
Haugh, B., Simple causal minimizations for temporal persistence and projection,
{\sl Proc. AAAI-87} (to appear).

\noindent
\item{5.}
Hayes, P., A logic of actions, in:
Meltzer, B. and Michie, D. (Eds.),
{\sl Machine Intelligence} {\bf 6}, Edinburgh University Press,
Edinburgh, 1971, 495-520.

\noindent
\item{6.}
Kautz, H., The logic of persistence,
{\sl Proc. AAAI-86} {\bf 1}, 1986, 401-405.

\noindent
\item{7.}
Lifschitz, V., Pointwise circumscription: preliminary report,
{\sl Proc. AAAI-86} {\bf 1}, 1986, 406-410.

\noindent
\item{8.}
Lifschitz, V., Formal theories of action,
{\sl Proc. Workshop on the Frame Problem in Artificial Intelligence}, Morgan
Kaufmann, 1987 (to appear).

\noindent
\item{9.}
McCarthy, J., Circumscription --- a form of non-monotonic reasoning,
{\sl Artificial Intelligence} {\bf 13} (1980), 27-39.

\noindent
\item{10.}
McCarthy, J., Applications of circumscription to formalizing commonsense
knowledge, {\sl Artificial Intelligence} {\bf 28} (1986), 89-118.

\noindent
\item{11.}
McCarthy, J. and Hayes, P., Some philosophical problems from the
standpoint of artificial intelligence, in:
Meltzer, B. and Michie, D. (Eds.),
{\sl Machine Intelligence} {\bf 4}, Edinburgh University Press,
Edinburgh, 1969, 463-502.

\noindent
\item{12.}
McDermott, D., A temporal logic for reasoning about processes and plans,
{\sl Cognitive Science} {\bf 6} (1982), 101-155.

\noindent
\item{13.}
Shoham, Y.,
Chronological ignorance: time, nonmonotonicity, necessity and causal theories,
{\sl Proc. AAAI-86} {\bf 1}, 1986, 389-393.
\vfil\eject
\end